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Heavy-Traffic Approximations for Service 
Systems With Blocking 


By W. WHITT* 
(Manuscript received October 27,1983) 


This paper develops approximations for the blocking probability and related 
congestion measures in service systems with s servers, r extra waiting spaces, 
blocked customers lost, and independent and identically distributed service 
times that are independent of a general stationary arrival process (the 
G/GI/s/r model). The approximations are expressed in terms of the normal 
distribution and the peakedness of the arrival process. They are obtained by 
applying previous heavy-traffic limit theorems and a conditioning heuristic. 
There are interesting connections to Hayward’s approximation, generalized 
peakedness, asymptotic expansions for the Erlang loss function, the normal- 
distribution method, and bounds for the blocking probability. For the case of 
no extra waiting space, a renewal arrival process and an exponential service- 
time distribution (the GI/M/s/0 model), a heavy-traffic local limit theorem 
by A. A. Borovkov implies that the blocking depends on the arrival process 
only through the first two moments of the renewal interval as the offered load 
increases. Moreover, in this situation Hayward’s approximation is asymptot- 
ically correct. 


I. INTRODUCTION AND SUMMARY 


In this paper we introduce and investigate approximations for 
congestion measures in G/GI/s/r service systems, which have s servers, 
r extra waiting spaces, the first-come first-served discipline, and 
independent and identically distributed (i.i.d.) service times with a 
general distribution that are independent of a general stationary 
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arrival process. Customers arriving when all s servers are busy and all 
r waiting spaces are full are blocked; they leave without receiving 
service and without affecting future arrivals (no retrials). We primarily 
focus on the case r = 0 (except for Section VII). We present approxi- 
mate expressions for the proportion of arriving customers that are 
blocked (call congestion) and the proportion of time that the system 
is full (time congestion). We also approximate the distributions of the 
number of customers in the system at arrival epochs and at arbitrary 
times. 

We obtain our approximations by applying previous heavy-traffic 
limit theorems!“ and a conditioning heuristic (Section III). As with 
much of the earlier work on this problem, we are not able to present 
a completely rigorous development, but we believe that we have a 
novel perspective that provides additional insight. There are interest- 
ing connections to earlier work, including Hayward’s approximation,° 
generalized peakedness,® asymptotic expansions for the Erlang loss 
function,’® the normal distribution method,!*"* and bounds for the 
blocking probability." 

Perhaps our most important contribution is to point out the signif- 
icance of a heavy-traffic local limit theorem by Borovkov [Theorem 
15(2) of Ref. 2], which was first published in Russian in 1972. (“Local” 
means that the limit is for the probability mass function instead of 
the cumulative distribution function.) For GI/M/s/0 models (no extra 
waiting space, renewal arrival process, and exponential service-time 
distribution), this theorem provides a rigorous basis for the approxi- 
mations under heavy loads. For example, this theorem implies that 
Hayward’s approximation [(22) in Section 6.2] is asymptotically cor- 
rect as the offered load increases. Of course, this property is consistent 
with extensive numerical evidence, but apparently no mathematical 
proof has been given before. 

Here is how this paper is organized. In Section II we review a heavy- 
traffic limit theorem for G/GI/ models that we will apply, which is 
also due to Borovkov.’ In Section III we introduce a conditioning 
heuristic and apply it with the limit theorem in Section II to obtain 
an approximation for the distribution of the number of busy servers 
at an arbitrary time in the associated G/GI/s/0 system. In Section IV 
we use a conservation law plus the approximation in Section III to 
generate an approximation for the blocking probability in G/GI/s/0 
systems. We also discuss an approximation for the distribution of the 
number of busy servers at arrival epochs. In Section V we state 
Borovkov’s local heavy-traffic limit theorem for GI/M/s/0 models that 
supports the approximations. We also make several conjectures about 
related theorems. 

In Section VI we discuss connections to other work. We indicate 
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that the normal approximation for the blocking probability in 
M/M/s/0 systems has long history, going all the way back to Erlang.” 
In the Appendix we also give a simple proof of the heavy-traffic local 
limit theorem for M/M/s/0 systems, using the elementary central 
limit theorem and Stirling’s formula.'® In Section VI we also discuss 
connections to Hayward’s approximation,”® bounds for the blocking 
probability,'°*” and previous normal approximations.?° “ 

In Section VII we indicate how the approach can be extended to 
systems with finite waiting rooms, drawing on Halfin and Whitt.* In 
doing so, we obtain a modification of Hayward’s approximation for 
the case of a finite waiting room [see (42)]. Finally, in Section VIII we 
give the results of experiments testing the approximations for 
G/M/s/0 systems. As observed by Rahko,!°-? Hertzberg,’® and Del- 
brouck,"* the normal approximation tends to work quite well except 
in low loads. 

We close this introduction by noting that the general blocking 
problem discussed here continues to generate considerable attention; 
several related papers were presented at the Tenth International 
Teletraffic Congress at Montreal.®'””°*4 Another recent related con- 
tribution is Newell.” 


Il. THE INFINITE-SERVER MODEL IN HEAVY TRAFFIC 


Consider the G/GI/o service system, which has infinitely many 
servers and independent and identically distributed service times that 
are independent of a general stationary arrival process. Let A(t) count 
the number of arrivals in the interval [0, t] for t = 0. We assume that 
A(t) satisfies a central limit theorem; i.e., 


[A(t) — dé]/(Ac7t)?? > N(O, 1) (1) 


as t 0, where —» denotes convergence in distribution, N(a, b) denotes 
a random variable with the normal distribution having mean a and 
variance b, and 2 is the arrival rate. When A(t) is a renewal process, 
c” in (1) is the squared coefficient of variation (variance divided by 
the square of the mean) of the renewal interval. More generally, the 
denominator in (1) typically is asymptotically equivalent to the stand- 
ard deviation of A(t), so that 


c? = lim var[A(t)]/At = lim var[A(t)]/EA(t). (2) 


The parameter c” in (1) and (2) is the basis for approximating the 
arrival process by a renewal process via the asymptotic method in Ref. 
26. 

Let »' be the mean and G(t) the cumulative distribution function 
(cdf) of the service-time distribution. Let a = \/p be the offered load. 
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Our approximations are developed by considering limits as the offered 
load a increases. We fix the service-time cdf G(t) and change a by 
changing X. 

Let X, be the equilibrium number of busy servers in the G/GI/o 
system at an arbitrary time, as a function of a. (We assume that a 
unique equilibrium distribution exists—see Section 2.3 of Franken et 
al.2’) Borovkov' proved the following heavy-traffic limit theorem for 
X,,. He showed that if (1) holds [actually a slightly stronger functional 
limit theorem for A(¢)], then 


(X,. — a)/Vaz > N(O, 1) (3) 
as a — 0%, where 
z=1+(c?—-1)n (4) 
and 
n= Ul i [1 — G(t)/dt; (5) 


also see Refs. 3 and 6. Actually, Borovkov did not directly treat the 
equilibrium variable X,, so that there is a further interchange of limits 
to get (3). For practical purposes, we can regard (8) as a consequence 
of (1). 

To interpret (3) through (5), recall that EX, = a for all a, even with 
a general stationary arrival process. The parameter z in (4) is the 
asymptotic peakedness of the arrival process A(t) with respect to the 
service-time cdf G(t) because 


z = lim var(X,)/EX, = lim var(X,)/a; (6) 
see Ref. 6 and references there. In particular, z in (4) is zg(0+) in 
Section 3.3.1 of Ref. 6. Note that 7 in (5) has the maximum value of 1 
attained by a deterministic service-time distribution (unit mass on 
uw!) and can assume any value in the interval (0, 1]. For example, 7 = 
2/3, 1/2, and 1/n, respectively, when the distribution is uniform, 
exponential and concentrated on two points with mass n™ on 0. 

We have defined three parameters characterizing variability: c”, n, 
and z. The parameter c” in (1) and (2) is a measure of the variability 
of the arrival process (over large time intervals). The parameter 7 in 
(5) is a measure of the variability of the service-time distribution and 
the parameter z in (4) is a second measure of variability of the arrival 
process (as measured by the G/GI/o model with service-time cdf G 
having variability parameter 7 in heavy traffic). The heavy-traffic 
peakedness z in (4) is increasing in c”, but whether it is increasing or 
decreasing in 7 depends on the sign of c” — 1. For a Poisson process, 
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c? = 1. In general, increased service-time variability as expressed by 
decreasing 7 in (5) tends to make z in (4) closer to one. Hence, if the 
arrival process is more variable or bursty than a Poisson process in 
the sense that c? > 1, then z in (4) is increasing in 7, which means 
that the peakedness increases as the variability of the service-time 
distribution decreases. When c? > 1, the deterministic service-time cdf 
gives the largest heavy-traffic peakedness among all service-time cdf’s 
with the same mean. This phenomenon seems to have been first 
discussed by Wolff?’ (see also Section 3.3.1 of Ref. 6). Related results 
about GI/G/1 queues are contained in Whitt.” 


Ill. THE CONDITIONING HEURISTIC 


We now use the heavy-traffic limit theorem (3) for the G/GI/« 
system to produce approximations for the associated G/GI/s/0 loss 
system, which has s servers, no extra waiting space, and the same 
arrival process and service-time distribution. Our starting point is a 
basic property of the Markovian M/M/s/0 models in which the number 
of customers in the system evolves as a birth-and-death process. The 
equilibrium probability p,(s,) that there are k customers in an 
M/M/s,/0 model is just the conditional probability that there are k 
customers in an M/M/s,/0 model given that there are no more than 
Ss; customers, for any s2 with s; S so S ™; in other words, p,(s;) is 
obtained by truncating and renormalizing p;(sz): 


Px(s1) = Pr(S2) y pj(s2) (7) 
= 


for0 <k<s, Ss. Truncation formula (7) is an immediate consequence 
of the known formula for p;(s), but also is easily derived for more 
general birth-and-death processes (see p. 68 of Heyman and Sobel).°° 

Let Y, be the equilibrium number of busy servers at an arbitrary 
time in the G/GI/s/0 model. (We also assume existence and unique- 
ness—see Section 2.3.2 of Ref. 27.) As an approximation, we assume 
that Y, is related to X, of Section II by the same conditioning formula 
(7). In particular, we suggest the heuristic approximation 


P(Y, = k) = P(X. = k)/P(Xa = 8) (8) 


for 0 < k ss. This conditioning approximation is no doubt an old 
idea, which would be hard to trace; it was used previously by Jagerman 
to develop approximate blocking formulas in the case of nonstationary 
Poisson traffic.*" 

Next. we obtain a further approximation by invoking the heavy- 
traffic limit theorem for X, in (3). For this purpose, let &(t) be the 
standard normal cdf, i.e., of N(0, 1), and ¢(t) the associated density. 
We combine (3) and (8) to obtain 
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P[k — 0.5 s N(a, az) Ss k + 0.5] 
P[N(a, az) Ss + 0.5] 


k-—at0.5 k-—a-—0.5 
o( ie )-a( Ve 
= s—-at0.5 
o( We 
k-«@ s-—-at0.5 
anlaae (“)/+(=FE™) 


The time congestion, say Br, is defined as By = P(Y, = s), so that 
an approximation for it is obtained from (9) simply by setting k = s. 

In further support of the conditioning heuristic, we note that it is 
also valid for the diffusion process limits that arise in several cases of 
heavy traffic, e.g., for the stochastic-process version of (3) in the case 
of exponential service-time distributions (see Ref. 1 and 3). 

We have applied the conditioning heuristic to the distributions at 
an arbitrary time instead of at arrival epochs. Since Poisson arrivals 
see time averages,””°°*? these two distributions are the same for 
M/M/s/0 systems. Also, the heavy-traffic limits for these distributions 
are the same for G/GI/o system. (See Ref. 3 for the renewal arrival 
process case.) However, these distributions are definitely not the same 
for the G/GI/s/r system or even the GI/M/s/0 special case. The 
conditioning heuristic seems to perform much better when applied to 
the distribution at an arbitrary time, as will be clear from the next 
two sections. This might not be surprising, but a good explanation is 
still needed. 


P(Y, = k) = 


v 


IV. A CONSERVATION LAW AND THE BLOCKING APPROXIMATION 


Let Bc be the probability that an arriving customer in the 
G/GI/s/0 system is blocked (call congestion). A basic conservation 
law enables us to express Bc in terms of EY,. In particular, since the 
average rate of accepted arrivals equals the average departure rate, 
not counting lost calls (see Heyman’), 


A(1 — Bc) = pEY,. (10) 


Hence, we can combine (9) and (10) to obtain an approximation for 
Be. 
It is well known and easy to show that 


E[N(O, 1)|N(0, 1) = 6] = —$(4)/#(6), (11) 
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so that 





EY, = a — Vaze (=) } b (= (12) 


and 








Bo=1- a7 EY, = Vz/ad (=) / 7 (=) = 2Br (13) 
a. 


(e.g., see Appendix A of Delbrouck"). 

We thus suggest (9) as an approximation for P(Y, = k), (9) with 
k=-s for an approximation for By, and (13) as an approximation for 
Be. 

Let Z, be the equilibrium number of busy servers in the G/GI/s/0 
system at arrival epochs, again as a function of the offered load a. For 
G/M/s/0 systems (exponential service-time distributions, but still 
general stationary arrival process), we have the exact relationship 


P(Z, =k —-—1) = kP(Y. = k)/a (14) 


for 1 < k<s (p. 113 of Franken et al.”’), which is a refinement of (10), 
because (10) is obtained from (14) by simply summing over k. We thus 
propose (14) as an approximation for G/M/s/0 systems and also 
G/GI/s/0 systems. Improved approximations for the nonexponential 
service-time distribution case can perhaps be obtained from the rela- 
tionships in Section 4.3.2 of Ref. 27, but this does not appear easy. 

These approximations are expressed in terms of the asymptotic 
peakedness z in (4). However, if this parameter is not available, other 
expressions for the peakedness can be used instead.® For example, the 
general formula for the peakedness of a renewal arrival process with 
respect to an exponential service-time distribution is given here in 
(27). 

Formulas (4) and (5) can also be used to calculate a revised peaked- 
ness if there is a change in the service-time distribution, as suggested 
in Section 6 of Ref. 33 and as has been done in practice. Suppose that 
z, has been previously determined based on the service-time cdf G, 
but now we are going to consider the same G/GI/s/0 system with new 
service-time cdf G,. For this purpose let n; = (G;) in (5). Using (4), 
we obtain an approximation for c”, namely, 


2=1+4 (4, -1)/m. (15) 
We can thus approximate 2, based on (4), (5) and 
22 = 1+ (c? — 1)y2 = 1 + (21 — 1)m2o/m. (16) 


More general transformations based on Mellin transforms have also 
been developed by Jagerman for the entire peakedness functionals 
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z[F] considered in Ref. 6 to describe the effect of changing the service- 
time distribution.** 

Approximation formulas (9), (13), and (14) can also be used to 
generate approximate multiplicative correction factors to be used with 
the exact M/M/s/0 formulas. For example, we can obtain our approx- 
imation by multiplying the exact blocking probability for the 
M/M/s/0 system by the ratio Bc(z)/Bc(1), where Bc(z) is Bc in (13) 
as a function of z in (4). This procedure is slightly more complicated, 
but it is exact for M/M/s/0 systems. 


V. BOROVKOV’S HEAVY-TRAFFIC LOCAL LIMIT THEOREM 


A rigorous justification of (13) is provided by Theorem 15 (2), p. 
226, of Borovkov.? For the GI/M/s/0 model, Borovkov established 
that 


lim VaBc = Vzo(8/V2)/®(8/Vz) (17) 
if (s — a)/Va —> B as a —> &, where z = (c” + 1)/2 as specified in (4) 
in the case of an exponential service-time distribution. 

Borovkov identifies z in (17) as (c? + 1)/2 rather than the heavy- 
traffic peakedness in (4), but we believe (4) is appropriate for gener- 
alizations to nonexponential service-time distributions and nonre- 
newal arrival processes. 

Borovkov’s arguments can also be applied to yield a related local 
limit theorem for P(Z, = k) in GI/M/s/0 systems, namely, 

lim VaP(Z, = k) = V26(6'/Vz)/®(6/Vz) (18) 
if (s — a)/Va > B and (k — a)/Va > B’ < B as a—> ©, We can then 
apply (14) to deduce that 

lim VaP(Y, = k) = (1/Vaz)$(8’/Vz)/®(8/vz) (19) 


aD 
and 


lim VaBr = (1/Vaz)o(8/V2)/(8/Vz2) (20) 
under the same asymptotic conditions. The limit in (19) coincides 
with Theorem 16, p. 232, of Borovkov,’? which is stated there without 
proof. 

Hence, we have theoretical justification for all our approximations 
in the case of GI/M/s/0 systems. We conjecture that (17) through (20) 
are still valid for general stationary arrival processes and nonexponen- 
tial service-time distributions, i.e., under the conditions for Borovkov’s 
G/GI/ theorem (3). Since (14) is valid for general nonrenewal arrival 
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processes but not for general service-time distributions, the conjecture 
seems much more likely to be true for the generalization of the arrival 
process than for the generalization of the service-time distribution. 


VI. CONNECTIONS TO OTHER WORK 
6.1 Beginning with Erlang 


For the Markovian M/M/s/0 model as well as the M/G/s/0 model, 
the exact blocking probability is given by the classic Erlang loss (B) 
formula 


B(s, a) = cps) / 3 (a"/k!). (21) 
h=0 


Since Poisson arrivals see time averages,””*°*? Bc = Br and (18) and 
(19) coincide in this case. In this case, approximation (9) reduces to 
a rather standard normal approximation which was evidently known 
in 1924 by Erlang.’® The M/M/s/0 case of the limit theorem (17) 
plus various asymptotic expansion improvements were given by 
Brockmeyer’ and Vaulot.? Asymptotic expansion improvements also 
follow from Theorem 14 of Jagerman® and are discussed on p. 88 of 
Delbrouck.* A simple proof of the M/M/s/0 version of (17) using the 
central limit theorem and Stirling’s formula’® is given here in the 
Appendix. 


6.2 The Hayward approximation 


A relatively simple approximation for the blocking probability Bc 
in G/M/s/0 and even G/G/s/0 systems as a function of s, a and z 
proposed by Hayward is 


Bc _ Be(s, Q,; 2) =~ Be(s/z, a/z, 1) aa B(s/z, a/z) (22) 


using (21) (see Fredericks’ and Eckberg‘). It is significant that the 
approximation (9) and the limit theorem (17) are consistent with (22); 
for those expressions, B(s, a, z) = B(s/z, a/z, 1). 

There are several possible interpretations and applications. We can 
interpret (17) through (20) as additional evidence in support of the 
Hayward approximation. The limit theorems and approximations 
provide additional evidence that the Hayward approximation should 
perform well under heavy loads. The Hayward approximation is par- 
ticularly appealing, given that there are convenient computer programs 
to calculate B(s, a) in (21) extended to nonintegral s, as have been 
developed by Jagerman.” 

On the other hand, we can use the Hayward approximation as an 
additional justification for (9). We can derive (9) by applying the limit 
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theorem for M/M/s/0 systems described in Section 6.1 together with 
the Hayward approximation. 

The connection also suggests that improved approximations can be 
obtained for G/M/s/0 systems and possibly G/G/s/0 systems by 
applying the asymptotic expansions of Brockmeyer,’ Vaulot,® or 
Jagerman’® for B(s, a) after applying the Hayward approximation to 
treat the peakedness z. In particular, formula (5) in Jagerman,’ which 
is based on Theorem 14 there, seems particularly promising in com- 
bination with Hayward’s approximation. However, testing remains to 
be done. Of course, improved approximations would also be obtained 
from asymptotic expansions related to (17). This seems to be a 
promising direction of research. 


6.3 Bounds for the blocking probability 


Sobel’® and Heyman” recently established a lower bound for the 
blocking probability in a G/G/s/r system when p > 1, namely, 


Bo2z1- rae (23) 


and observed that the lower bound often is a good approximation 
when p > 1. 

We partly explain why the lower bound is a good approximation 
by showing that it appears in limiting lower and upper bounds as 
a — © in our heavy-traffic approximation (9) for G/G/s/0 systems. 
(This paper is a revised version of Ref. 11 in Sobel,’® where our result 
is mentioned.) 

We use the familiar bounds for the tail of a normal distribution 


(x1 — x 3)h(x) <1 — B(x) = B(—x) < x1 A(x), x>0; (24) 
see p. 175 of Feller.!? To make the connection to (9) and (17), let x = 
(a — s)/Vaz. Since xVz/a = (1 — p7), and 


x(x) 


sep ee (25) 


1s 


by (24), from (9) we obtain the approximate bounds as x > 
(l-—p")<sBes (1-p")1- x)", (26) 


which are useful for p = 1. The distance between the bounds goes to 
zero as x —> %, 

Holtzman also establishes bounds for Bc in the GI/M/s/0 system.” 
In fact, he described the range of all possible values of Bc given only 
the offered load a and the peakedness z’ of a renewal process, which 
is 


z’ = [1 — (nu) — a, (27) 
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where 


b(n) = i e“dF(t) (28) 


with F(t) the cdf of an interarrival time. By (17), we know that the 
range approaches 0 as a > © with (s — a)/Va — B. The 2’ in (27) 
approaches z in (4) and, by (9) and (17), Bc depends only on a, s and 
z for large a. 


6.4 The normal-distribution method 


The approximations here for P(Y, = k), Br, and Bc in (9), (13), 
and (14) coincide with the normal-distribution method (NDM) of 
Rahko’® and Hertzberg’? and the normal approximation for the 
Bernoulli-Poisson-Pascal (BPP) approximation of Delbrouck,’* but 
the analysis here is different. 


6.5 Light-traffic approximations and interpolations 


The approximations here have been developed by considering the 
service systems in heavy traffic, i.e., as the offered load increases. 
Improved approximations for lighter loads may be possible by consid- 
ering the service systems in light traffic, i.e., as the offered load 
decreases. Better approximations might be obtained by making inter- 
polations between light and heavy traffic. This seems to be another 
promising direction for future research. Previous work on light-traffic 
approximations for queues is contained in Bloomfield and Cox,*® 
Newell,®” and Burman and Smith.*®*? Interpolations between light and 
heavy traffic have been considered by Burman and Smith® and 
Reiman and Simon.*° The hybrid approximations for queues with 
superposition arrival processes developed by Albin*’ and used in Ref. 
42 to approximate networks of queues are also in this spirit. 


VII. FINITE WAITING ROOMS 


Corresponding approximations can be developed for G/GI/s/r sys- 
tems with r extra waiting spaces. We can apply heavy-traffic limit 
theorems for G/GI/s/ systems (see Ref. 4), together with the condi- 
tioning heuristic of Section III. The conditioning relationship (7) is 
also valid for M/M/s/r systems with different values of r. As Ref. 4 
describes, there are several possible heavy-traffic limit theorems to 
apply. For GI/M/s/» systems, we suggest the heavy-traffic limit 
theorems in Ref. 4 with (s — a)/Va — B as a > & or, equivalently, 
(1 — p)Vs — B, where a = X/u and p = a/s. This leads to a promising 
analog of Hayward’s approximation (22) for the case of a finite waiting 
room, namely, (42) below. 
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From Section 4 of Ref. 4, it is evident that the extension to 
nonexponential service times is more difficult when r > 0, but we 
conjecture that the GI/M/s/ results in Ref. 4 extend to G/M/s/ 
systems (nonrenewal arrival processes) and that the conditioning 
heuristic is valid in heavy traffic for G/M/s/r systems. In support of 
this, the conservation relationships (10) and (14) extend to G/M/s/r 
systems. (The factor k on the right side of (14) is replaced by 
min{k, s}.) The corresponding heavy-traffic local limit theorem for 
M/M/s/r systems is easy to prove using the methods of Ref. 4 or the 
Appendix. We conjecture that heavy-traffic local limit theorems cor- 
responding to (17) through (20) are also valid for GI/M/s/r systems 
as well as for the more general G/M/s/r systems. Moreover, we 
conjecture that the form of the limits will coincide with what we get 
by applying the conditioning heuristic to the GI/M/s/ limits in Ref. 
4, 


In this section, let X, and Y, be the equilibrium number of customers 
in G/M/s/ and G/M/s/r systems, respectively, at an arbitrary time. 
For a large with (1 — p)Vs = 8, the approximations derived from Ref. 
4, where the limit theorem is proved only for renewal arrival processes 
(see Propositions 1 and 2 and Theorems 1 and 4), are 


P(X, = 8) = y = [1 + 6'8(8’)/H(8’)T", (29) 
P(X, >st+r|X,28) 2% n= errs, (30) 
P(Y, 2s) = ER (1 — eh) /(1 — eh), (31) 
VsP(Y, = k| Ya Ss) ¥ $(8’ + 6)/8(6’), (32) 
VsP(Y. = k| Ya = 8) © (yB'/é)e*’, (33) 
and 
VsBr = Ble BINS (34) 


for (k — s)/Vs = 6 and a’ = B/z with z being the peakedness in (4). 
Since 


E(min{X,, s}) = a = sp (35) 
for all G/G/s/~ systems, e.g., by (4.2.3) of Ref. 27, 


sP(s <= X, = str) +sp — sP(X, 2 s) 


H(niny Fa 8)) P(X, <s+n) 


= [s(y — yn) + s(p — y)]/(. — yn) 
= s(p — yn)/(1 — yn), (36) 
for y and » in (29) and (30), so that by (10) 
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Be = 1 — a E(min{Y,, s}) = 1 — p™(p — yn)/(1 — yn) 
= (1 — p)yn/p(1 — yn) (37) 
and 


VsBc = Byn/p(1/ yn) © (2/p)VsBr (38) 


so that Bc = zBras in (13), although the correction with p in (38) may 
be useful. 
Also note that 


n = n(p, 8, 7, 2) = n(p, 7, 2) © n(p, r/2, 1) (39) 
and 
1 = (0, 8, 7, 2) = y(p; 8, 2) © y(p, 8/2”, 1), (40) 
so that 
Bc = Be(p, s, r, 2) * Be(p, s/z*, r/z, 1) (41) 


or, equivalently, 
Be = Bela, s, r, 2) ® Be(a/2’, 8/2”, r/z, 1) (42) 


in the manner of Hayward’s approximation in Section 6.2. If we 
express Bc in terms of 1 — p, then we have the alternate expression 


Be = Be(1 — p, $7, 2) ~ Be[(1 a p)/2, $, 7, uP (43) 


We can achieve (42) by fixing s, r and » and then changing X so that 
(1 — p)/z is unchanged while z is replaced by 1. As with Hayward’s 
approximation, we can use M/M/s/r formulas when z = 1. 

Note that the normalizations in (41) through (438) are not the same 
as in Hayward’s approximation in Section 6.2. Comparing (42) with 
(22), we see that now a and s are divided by z? in (42) instead of z, so 
that evidently the peakedness has a much greater impact when there 
is a finite waiting room. This might be expected because now the 
boundary where losses occur is further away from the center of mass, 
with p required to be less than one. The different normalization of r 
and s might be expected because this approximation is based on r 
being of order Vs. 

Of course, the approximations developed in this section need to be 
tested, which has not yet been done. From the theory, we know that 
the modification of Hayward’s approximation for finite waiting rooms 
in (42) should work well when p is high but less than one, s is large, 
and r is of order Vs, but it remains to determine the actual range over 
which the approximation is good. 
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VIII. TESTING THE BLOCKING APPROXIMATION 


In this section we present numerical comparisons to test the ap- 
proximation for the approximate blocking probability Bc in (13). Our 
testing here is confined to G/M/s/0 systems. Since formula (13) 
coincides with the normal approximations of Rahko’®™ and Del- 
brouck,!* their comparisons are relevant too. 

Tables I through IV compare the heavy-traffic approximation (13) 
with exact M/M/s/0 results and the equivalent random method. We 
selected seven different blocking probabilities: 0.001, 0.01, 0.05, 0.10, 
0.20, 0.40, and 0.60. The higher numbers were selected so that we 
could test the lower bound in (23), which is only applicable when a > 
s. We also selected four different (z, s)-pairs: (1, 5), (1, 50), (2, 50), 
and (10, 400). In each case, we used the charts on pages 23-32 of 
Wilkinson*® to determine the corresponding load in Erlangs dictated 
by the equivalent random method for the given blocking probabilities 
and parameters s and z. When the peakedness was not 1 (Tables III 
and IV), we also calculated Hayward’s approximation (22). For Hay- 
ward’s approximation we often used the more detailed graphs in 
Appendix A of Cooper.** (The results were also checked using Jager- 
man’s computer programs.*’) In parentheses next to the lower bound 
is the upper bound obtained from (26). It should be noted that the 
upper bound in (26) is an upper bound for our normal approximation, 
not necessarily on the true blocking probability. 

In interpreting the tables, remember that the equivalent random 
method is only an approximation too when the arrival process is not 
Poisson. Moreover, from Holtzman’ we know that the range of 
possible blocking probabilities consistent with the partial information 
provided by the peakedness can be quite wide. (As we indicated in 
Section 6.3, this is not true in heavy traffic.) Hence, there often is 
little reason to prefer the numerical accuracy of exact calculations 
according to the Erlang loss formula (21) over the normal approxi- 
mation (18). 


Table |—The blocking probability Bc(s, a, z) for z = 1 and 


s=5 
Heavy-Traffic 
Load in Erlangs Blocking Approximation Bound (23) 
a Probability (13) 1-—p'forp>1 
0.77 0.001 0.000 
1.37 0.01 0.003 
2:22 0.05 0.05 
2.86 0.10 0.11 
4.0 0.20 0.23 
6.6 0.40 0.39 0.24 (0.33) 
11.5 0.60 0.49 0.57 (0.77) 


Note: In parentheses to the right of the lower bound (28) is the 
approximate upper bound in (26). 
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Table II—The blocking probability Bc(s, a, z) for z= 1 and 


s = 50 
Heavy-Traffic 
Load in Erlangs Blocking Approximation Bound (23) 
a Probability (13) 1-p'forp>1 
32.5 0.001 0.001 
38.0 0.01 0.01 
44.5 0.05 0.05 
49.7 0.10 0.10 
59.0 0.20 0.21 0.15 (0.16) 
82.0 0.40 0.42 0.39 (0.40) 
122.0 0.60 0.59 0.59 (0.61) 


Note: In parentheses to the right of the lower bound (23) is the 
approximate upper bound in (26). 


As we expected, the quality of the approximation, as measured 
against the exact formula when z = 1 or the equivalent random method 
when z ¥ 1, improves as a or s increases and z decreases. The parameter 
a/z gives a good indication of the quality to be expected; i.e., the 
quality depends approximately on a/z and improves as a/z increases. 
The heavy-traffic approximation tends to degrade as the number of 
servers gets beyond two or three standard deviations (Vaz) away from 
the G/GI/oo mean (the load a). 

Table V presents some of Kuczura’s* results (as displayed in his 
Figures 1-3) for GI+M/M/s/0 systems (having an arrival process that 
is a superposition of a renewal process and a Poisson process) together 
with our heavy-traffic approximation. A significant feature of these 
systems is that the blocking experienced by the customers in the 
different streams is not the same. Since the arrival rates in the two 
streams are identical in each case, the blocking experienced by an 
arbitrary customer, which is what our approximations are for, is the 
average of the blocking probabilities associated with the separate 
streams. 


Table III—The blocking probability Bc(s, a, z) for z= 2 and s = 50 
Approximate 
Load in Blocking Prob- Hayward’s Heavy-Traffic Bound (23) 
Erlangs ability (Equiv. Approximation Approximation 1-— p™' forp> 
a Rand.) B(25, a/2) (13) 1 
26.5 0.001 0.001 0.001 
32.6 0.01 0.01 0.01 
40.2 0.05 0.05 0.06 
45.8 0.10 0.10 0.11 
55.5 0.20 0.20 0.21 0.10 (0.10) 
78.9 0.40 0.40 0.38 0.37 (0.38) 
120.0 0.60 0.59 0.58 0.58 (0.61) 


Note: In parentheses to the right of the lower bound (23) is the approximate upper 
bound in (26). 
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Table 1V—The blocking probability Bc(s, a, z) for z= 10 and s = 400 


Approximate 
Load in Blocking Prob- Hayward’s Heavy-Traffic Bound (23) 
Erlangs ability (Equiv. Approximation Approximation 1-— 7! forp> 


a Rand.) B(40, a/10) (13) 1 

256 0.001 0.002 0.001 

290 0.01 0.01 0.01 

340 0.05 0.05 0.05 

385 0.10 0.10 0.10 

460 0.20 0.20 0.22 0.13 (0.14) 
640 0.40 0.40 0.41 0.38 (0.39) 
900 0.60 0.56 0.56 0.56 (0.57) 


Note: In parentheses to the right of the lower bound (23) is the approximate upper 
bound in (26). 


To obtain the heavy-traffic approximation, it is necessary to specify 
the peakedness of the arrival process. When the arrival process is the 
superposition of independent renewal processes, at least one of which 
is not Poisson, the superposition process is not a renewal process (see 
Ref. 26 and its references). However, the peakedness of the superpo- 
sition process is clearly the convex combination of the individual 
peakedness values. Suppose there are n independent streams with }; 
the arrival rate and z; the peakedness of stream 1. Then clearly 


Table V—The blocking probability for a Gl4M/M/s/0 
model with s = 25: comparison with Kuczura*® 


Load in Erlangs, a 


System 20 25 30 
M/M/s/0 
Arbitrary arrival 0.050 0.144 0.245 
Heavy traffic 0.054 0.148 0.229 
D + M/M/s/0 
Poisson arrival 0.045 0.15 0.28 
Renewal arrival 0.027 0.11 0.20 
Arbitrary arrival 0.036 0.13 0.24 
Heavy traffic (z = 0.75) 0.037 0.124 0.214 
Hayward (z = 0.75) 0.035 0.126 0.232 
GI + M/M/s/0 [2(G) = 2] 
Poisson arrival 0.058 0.14 0.21 
Renewal arrival 0.094 0.20 0.31 
Arbitrary arrival 0.076 0.17 0.26 
Heavy traffic (z = 1.5) 0.084 0.186 0.274 
Hayward (z = 1.5) 0.077 0.172 0.268 
GI + M/M/s/0 [2(G) = 3] 
Poisson arrival 0.06 0.14 0.20 
Renewal arrival 0.15 0.26 0.38 
Arbitrary arrival 0.115 0.20 0.29 
Heavy traffic (z = 2) 0.12 0.21 0.30 
Hayward (z = 2) 0.101 0.195 0.286 
Equivalent random method (z = 2) 0.105 0.200 0.285 


Note: All entries except the heavy-traffic, equivalent-random method, 
and Hayward values are from Kuczura.* 
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wr 


A=); and » Aizi/d. (44) 
i=1 


i=1 


We use (44) to obtain the peakedness values for the heavy-traffic 
approximation given in Table V. For the case in which the peakedness 
of the superposition process is z = 2, we also compared the blocking 
probabilities with those obtained using the equivalent random method, 
Chart 2 on page 24 of Wilkinson,*® and Hayward’s approximation 
using Jagerman’s program.* The results seem to be good, about the 
same as those in Tables I through IV. 
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APPENDIX 
A Heavy-Traffic Local Limit Theorem for the Erlang Loss Formula 


For the elementary M/G/s/0 system, where the blocking probability 
is given by the Erlang loss formula (21), approximations (9) and (13) 
follow easily from well-known limit theorems for the Poisson distri- 
bution. We present one possible argument here. 

Let p(k, \) denote the probability mass function of the Poisson 
distribution with mean \, i.e., 


p(k, \) = eR. (45) 
The Erlang loss formula then can be expressed as 
B(s, a) = pls, a) / D plk, a). (46) 


As before, let ¢(x) and ®(x) be the density and cdf, respectively, of a 
standard normal random variable, N(0, 1). 


Theorem: In an M/G/s/0 system, 
lim VaB([a + sVa], a) = $(s)/®(s), 


ae 


where [x] is the greatest integer less than or equal to x. 


Proof: Since a Poisson random variable with mean ) has the same 
distribution as the sum of n i.i.d Poisson random variables with mean 
d/n, the central limit theorem can be applied to obtain 


[a+sVa] 
lim > p(k, a) = &s) 
. ano ~=_ k=) 
(see Problem 9, p. 194, and Example X(c), p. 245, of Feller!®). Hence, 
it remains to show that 


lim Vap[(a + sVa), a] = 6(s) = (2a) e782, 


ao 


We can establish this result using Stirling’s formula (p. 52 of Feller’®). 


As in Stirling’s formula, let the symbol ~ below mean that the ratio 
of the two sides tends to 1 as a > ©. We have 


Vae Malet) 
(a + sVa)! 
Ve" (a 'atsva)) 


V2m(a + sVar) (a + sVer)M2e-let9ve) 
esva 


© VOn(1 + s/Va)“(1 + s/Vay™ 


e-*” esva 


“ADE OLS? 


Vap[(a + sVa), a] = 
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where 


log(e*¥(1 + s/Va)~*) =sva-—a log(1 + s/Va) 


Va Va 
Hence, 
es¥“(1 + s/Va)* ~ e%? 
and 
Vapl(a + sVa), a] ~ (20)-Me7"?? 
as claimed. 
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We model the arrival of calls at a switch where they are assigned to any 
one of the available idle outgoing links. A call is blocked if all links are busy. 
It may be lost after assignment to an idle link with a probability that depends 
on the link. For the case of Poisson arrivals and exponential holding times, 
all show that the distribution of passage time to the blocked state is independ- 
ent of the assignment policy. As a consequence, the blocking probability and 
the law of the overflow traffic are also independent of the assignment policy. 


I. INTRODUCTION 


Telephone calls arrive at a switching center in a Poisson stream of 
rate A. When a call arrives, the switch, in accordance with a pre- 
specified policy, assigns the call to one of the idle outgoing links. If 
the call is assigned to the idle link 1, there is a probability 1 — e; that 
the call is unsuccessful, and probability «; > 0 that it is successful. An 
unsuccessful call is immediately lost and link i remains idle. If the call 
is successful, link 1 immediately becomes busy and remains in that 
state for a holding or conversation time, which is an exponentially 
distributed random time with mean 1/y. At the end of this holding 
time link 7 returns to the idle state again. 

The preceding paragraph describes a model that approximates the 
behavior of a single node in a telephone network. In an actual network, 
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for a call to be placed successfully a circuit or path of idle links must 
be established between the call’s originating node and the destination 
node. The node where the call arrives assigns it to one of its idle links, 
say, link i. As the call proceeds from node to node towards its 
destination, it may encounter a node all of whose outgoing links are 
busy, in which case the call is lost. The probability that this may 
happen is 1 — ¢;. This probability depends on the entire path that the 
call follows. It is assumed here, however, that the probability depends 
only on the initial link i. Such a link-by-link analysis is a common 
approximation in traffic theory.! 

Suppose the model has n outgoing links indexed i € Q := {1, 
2, ---,n}. The state of the links or of the switching node is the subset 
ICQ of idle links. So the state space is 2° and there are 2” states. The 
state where there is no idle link, namely state ¢, is the blocked state. 
In a manner specified precisely in the next section, each switching 
policy u defines an irreducible Markov chain on this state space. Let 
t,(I) be the first time that, in equilibrium, the chain starting in state 
I reaches the blocked state ¢, and let p, be the time to return to ¢ 
after leaving ¢. 

In this paper we show the surprising fact that the distributions of 
Tu(I) and p, are independent of the policy u. Several consequences 
follow from this: First, the blocking probability p,(¢), an important 
performance measure, is independent of u. Second, the process of 
blocked calls* has a law that does not depend on the policy u either. 
Hence, in designing a switching policy one need not worry about 
blocked calls since they cannot be affected by the policy anyway. The 
overflow process, which consists of both lost and blocked calls, does 
of course depend on the policy. 

This paper is organized as follows. Section II presents the Markov 
chain description. Section III gives the calculation of the distribution 
of the passage time 7,(/). An algorithm is given to evaluate these 
distributions. Section IV contains two examples. For the special case 
where there are only two possible values for the loss probabilities «,, 
the algorithm simplifies. For this case a formula for the blocking 
probability is conjectured. The formula extends the well-known Erlang 
formula and it has been verified for many examples. However, a proof 
of its correctness is not yet available. 


Il. THE MARKOV CHAIN DESCRIPTION 


A policy u prescribes, for each state J, the probability with which an 
arriving call is assigned to an idle link i € J. Thus u is specified by an 
array u = {u(/,1),1 € I ¥ o} such that u(/, i) = 0, ¥; ul, i) = 1. 

Each u defines a transition rate matrix R, = {R(U, J), 1C Q,J CQ} 
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whose nonzero elements are 
RLU, J — t) = dud, 1), LE. 
RLG,I+j)=4, Jl 
RL, I) = - > Au, t) — (n — |Z] )u, (1) 


where ); := 6A, J-—i1:= I\i, 1 +7 := 1 U {7}, and |J| is the cardinality 
of I. The first expression in (1) gives the rate with which an idle link 
becomes busy, the second gives the rate with which a busy link becomes 
idle, the third gives the diagonal term R,(/, J) = Yyx; Ru, J). For 
later reference observe that the column of R,, corresponding to the 
blocked state has elements 


R.(¢, ¢) = —np, R.({7}, $) = Ni 
RU, ¢) =0 for |Z] >1. (2) 


Since each ¢; > 0 by assumption, it follows that the chain defined 
by R, is irreducible and possesses a unique invariant positive proba- 
bility measure {p,(J), I C Q}. In particular, p,(¢) is the equilibrium 
blocking probability. It will be shown to be independent of u. This is 
surprising in view of the fact that, if the ¢; are all different, then for 
every I ¥ ¢, p.(J) depends on u. 

Suppose the chain starts in state J ¥ ¢ at time 0 and let 7,,() denote 
the first time the chain reaches ¢. The distribution of 7,(I) can be 
obtained as follows. Let R, be the rate matrix obtained from R,, by 
making ¢ an absorbing state: 


R.(¢,j)=0, RJ) =RA,J), Io. (3) 


Let e, = {e,(J), J C Q} denote the (column) vector whose only nonzero 
component is e,(J) = 1. Let p; = { p;,(J)} be the solution of 


piJ) = 2 pAK)Ru(K, J), Po=er. (4) 


Then one has 
Prob{7.(Z) s ¢} = p.(¢), 


i.e., D:(@) is the cumulative distribution function of the first passage 
time 7,(/). From (4) it follows that the Laplace transform F,(J, s) of 


p:(@) is given by 
FL, s) = ef(s F— R,)"e5, (5) 


where _# is the identity matrix. 
The next section is devoted to the main result: 
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Theorem 1: (s_¥ — R,)~e, does not depend on u. 


The theorem has several consequences. First, it shows that the 
distribution of the passage time 7,(J) does not depend on wu. Second, 
suppose the chain corresponding to R,, leaves state ¢ at time 0-. Then 
at time 0 it enters one of the states {1}, --- , {n}, each with probability 
1/n. Let p, be the first time that the chain returns to ¢. The Laplace 
transform G,,(s) of Prob{o, < t} is simply 


Gu(s) = 


= 


Y Fullil, 8) 


which is independent of u. In particular, the expected value p of p, is 
independent of u. The expected holding time in state ¢ is 1/ny, and 
so the blocking probability 


Puld) = Lee 


is also independent of u. Finally, the process of blocked calls can be 
described by a sequence of independent intervals S,, T,, So, To, ---. 
Each S; has the same distribution as p,(@), and during this interval 
there is no blocked call. Each T; has the same distribution as the 
holding time in state ¢, and during this interval all arriving calls are 
blocked. Thus the statistics of the blocked call process are not affected 
by u. 

Ill. CALCULATION OF (s_7— R,)~'e, 


Throughout this section a fixed policy is considered and so the suffix 
u is dropped. 
Using (1) through (3) write 


R =A, + A, + B, 
with the nonzero elements of these matrices being 
A), I — i) = d;ud, 1), 1E€] and Il-i#¢ 
AT, I) = — ¥ Aw, i), l#¢ 


ie] 
A,Ud,I+j)=4, JE and I#¢ 
A,Z, [I)=—-(n-|I|)u, IF ¢ 
B({i}, 6) = R({i}, ) = rz 
Then 
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Aj(¢, -) = A\(-, 6) = A,(¢, -) = A,(-, ¢) = 0, (6) 
and for any vector x = {x(JI)}, 
(Ayx)(I):= VAM, S)x(WJ)= Y Aw, i)x(I—i) 
j ier 
I-i#6 


TF x NUCL, t)x(I), (7) 
iel 
(Ayx)(1) =u Xe +J)— (n= |T|)ux), I#¢ 


=0, IJ=¢: (8) 


First one calculates R7e,, d = 0. 
From (6) 


V1 Re, = Be, (9) 


so its only nonzero components are v;({i}) = ;. In particular, since 
vi(¢) = 0, it follows that 


R%e, = (A, + A,)* "v1, d=1. (10) 


The appendix introduces vectors wg and zz, d = 1. [See eqs. (16) 
through (19) and (24) through (26).] Define vectors va by 


va(I) = (-1)!4! 4 wa (J), lsdsn+1. (11) 


It can be checked that v, given by (9) and (11) are the same. 
Next one evaluates A)vg. 


Lemma 1: Vae1 = A,Va, lsdsn (12) 
Vat+i = — »y Zr(Q)Vndi—r (13) 
r=1 


Proof: Since wa(¢) = va(¢) = 0 and since A,(-, ¢) = 0, therefore, 
trivially, 


Vat1($) = (Ayva)($) = 0. 
Now suppose J = {i}. From (7) we get 
(Ayva)({t}) = —Aival{t}) 
= —);(-1)*wa({i}) 
‘ = (—1)* wasi({i}), by Lemma 3, 
= vasi({i}). 


Next suppose |J| = 2. From (7) 
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(Ayva)(I) = ¥} Aw, i)va — i) — DY AWC, 1)valT) 
ier iel 


= ¥ ul, i)(-1)""*4[AywaI) + Awall — 1)] 


iel 
= (=) b ull, D pwanld. by Lemma 3, 
iel 


7 (—1)7!¥4wasi(1) = Vaiil(J). 
Finally, 
Vnti(l) = (-1)4°wri(D) 


—(-1)"1*" } (—1)'z,(Q)Wrsi_(Z), by Lemma 15 
r=1 


ay y Z(Q)Vn41-r(L). O 


Next one evaluates A,va. Define 


GN. 
ieg 
d-1 
Lemma 2: A,Wa = —p ¥ (—1)'q-Va-r — p(n — d)va, d2=1. (14) 
r=1 


Proof: From (7), and for J # 4, 
(A,va)(Z) = pw » vall +7) — (n — |Z |)pvalz) 


ele 2 wall + J) + (n — Twat] 


Jj 


p(—1)I+8 F pa > Ajwa-r(1) + 5) » jwa-r(Z) 


JGI r=1 r=1 jel 


+ (n — d)wa(I)], by Lemmas 4 and 6, 


d—-1 
w(—1) 4 5 qrWa-(I) + (n — wat] 


d-1 
-|s (—1)’9-Va--Z) + (n — va} O 
These two lemmas yield the next corollary. 
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Corollary 1: For each d = 1, there exist coefficients ag, --+ , Qaaan not 
depending on u such that 


es dAn 
R%e, = > aaeVe, 
e=1 
where d A n := min(d, n). 
Proof: This is apparent from eqs. (10) through (14) and the fact that 
the vectors Wa, Zz, and q, do not depend on u. 
Corollary 1 implies the existence of coefficients ao(s), --- , an(s) 
such that 


(s_ 4% —- R)~e, = ap@y + A1V1 + +++ + OnVn. 
These coefficients can be readily calculated. Multiplying both sides by 
(sF = R), 
ey = ao(sF% — R)es + ¥ ag(s_% — R)va 
1 


n n n 


aySeg — aoV1 + Y agsVa — Y agAyva — Y agA,Va 
1 1 1 


n n 


apse, + ») (ags — Qa-1)Va — Qn y Zn+1-d(2)Va 
1 1 


n d-1 n 


+u Yaa XY (-'Grva-- + wp Y aa(n — d)va. 
1 r=1 1 
Rearranging terms and equating the coefficients of ey, vi, ---, Vn 
gives 
ao = oo. 
n—d 
aq = [s + (n — d)plog + uw YY (—-1)'Greasr 
r=1 
— Zn+i-d(Q)an, lsden. 


The last n equations are in “triangular” form and can be solved 
recursively to yield 


Qn-r = Q,(s) Qn, 


where Q,(s) is a monic polynomial in s of degree r. This gives 


ao = s = Q,,.(s) an. 
Ss 


Hence 


CALL BLOCKING 715 


(s 4% — R) Te, = Zs [Q,-1(s)vi + +--+ + Qi(s)Vn-1 + Vn. 


a (s) 
Since the right-hand side does not depend on u, this proves Theorem 
1, announced in the previous section. 


IV. EXAMPLES 


Suppose the n links are grouped into two “trunks,” trunk 1 with n, 
and trunk 2 with nz links, ny + nz = n. The loss probability of every 
link in trunk 7 is the same, namely, 1 — ¢;. In this special case one may 
construct a Markov chain with state (i, t2), where 0 <1; = n; and0 < 
ly <S Ng are the number of idle links in trunks 1 and 2, respectively. 
The number of states reduces to (nm; + 1)(n2 + 1). (Of course, if links 
within the same trunk are distinguished, then 2” states are needed as 
before.) This reduction carries over to the argument in the appendix, 
and so one can define the “generating” vectors Wg, Vg with components 
indexed by the reduced state (i, ic). The recursive definition of wg can 
be explicitly solved to obtain 

d-ig 


Wa(is, lz) = y B(k, i:)B(d — k, iz)AAE*, 
k=iy 
where 
k-i a eee 
B(h, i) = oe Bis i) 
j=0 p= J 


Also, as before, 


Vall, 12) = (—1)972** wa (in, l2), Liss ds n 
n 
Vnsi(i1, iz) = —Y Z(m, N2)Vns1-r(i, te), 
r=1 


where 


d 
Za(li, to) = Y C(k, u)C(d — k, is)ATAZ™*, 
k=0 


Cie @ 


Let p,(%1, 2) denote the probability that in equilibrium 1;(i2) links in 
trunk 1(2) are idle. This probability generally depends on u. However, 
Do := p.{0, 0), the blocking probability, is independent of the assign- 
ment policy. Calculating this explicitly in terms of the parameters ),, 
de, #, and for small values of n, n2 suggests the general formula: 
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Table |—Blocking probabilities 


Policy Pi P2 Pb 
n, =9,n.=9 uy 0.178 0.82 x 1073 0.502 
y=10 U2 0.84 x 103 0.122 0.502 
nm, = 9,n2= 19 uy 0.498 0.807 x 10-3 0.63 x 1073 
y = 20 Up 0.14 x 10°3 0.44 x 107? 0.63 x 107° 


Notation: y = Aw™, pi = Di, PCO, iz), D2 = Yi, Pl, 0) 


nytng d 


pe = XY Y v@ d)yr'ya*™, (15) 


d=0 i=0 
where y; := Aju ', i = 1, 2. The coefficients y are given by the recursion 
y(i, 0) = 1, all i 
y(t, d) = 0, i>d or i<0 
yi, d +1) = (my —i + 1)yWi — 1, d) + (m2 — d + i)y(i, d). 


The formula (15) reduces to the Erlang loss formula when 4; = yo. It 
is true for all y1, yo and n; + nz S 3, and it has been verified for many 
particular cases. However, a proof of correctness is not available. 

Consider now two numerical examples. In both «, = 0.81, ¢. = 0.7. 
Let u,, respectively uz, denote the “overflow” policy that assigns every 
call to trunk 1, respectively trunk 2, if it has an idle link. The table 
below shows that the probability of blocking a single trunk does vary 
considerably with policy, but the probability of blocking both trunks 
does not vary. 
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APPENDIX 
For d = 1 define vectors wa = {wa(J), J C Q} as follows: 
wa(l) = 0, |\I| >d (16) 
wall) = I Xi, |\I| =d (17) 
ier 
Wa(I) = Awa (J) + AWa-all — 1), (18) 
where L:= min{j|j € J}, 0<|I| <d, 
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wad) = 0. (19) 
Lemma 3: wa(I) = AjWe-i(J) + AjWa-i — J), 
JEL IcqQ, d=>2. (20) 


Proof: By direct verification we see that (20) is true for d = 2. Suppose 
it is true for d < e, and consider d = e + 1. If |I| > e + 1, (20) is 
trivial. 

Suppose |J| =e + 1. Then, from (17), 


Weri(l) =I Nie 
ie] 


On the other hand, if j € J, then 
Ajwe(L) + Ajwell —j) =O+ vj II Ni = I] Vi 
ieI-j 


ier 


and so (20) again holds. 
Finally, suppose 0 < |I| <e + 1, and let i = min{j|j7 € J}. Then by 
(18), 


Wer) = Azwe(I) + Azwe(I — i). (21) 
If j = 1 then (20) again holds. Suppose j # 1. By induction hypothesis, 
we(I) = AyWe-i(I) + AyWe-ill — J) 
wl — i) = AjWe-ilT — i) + Ajwe-i(I — i — Jj). 
Substitution in (21) gives 
Wer) = AA We-r) + we-i(I — j) + Well — 1) 
+ well — i —j)] 
= AAiwe-i(1) + Aiwea(l — 1)] + A\Diweall — J) 
+ \GWe-i(I — j — 1)] 
= Ajwe(I) + Awe — J), by induction hypothesis. O 
Lemma 4: wall +7) = . NWa- TL), JEL. 


Proof: From (20) 
wall +j) = AjWa-1(T +j) + AjWa-1(Z). 


The assertion follows by iterating on the first term on the right. O 
Let N(I, d) denote the set of all ||-tuples n = {n;|i € I} such that 
n;= 1 and Yin; = d. 


Lemma 5: wal) = YY TA 


nEN(I,d) ie 
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Proof: If d = 1, the assertion is immediate. Suppose it is true for some 
d= 1. From (20) andj EI 


Wail) = Aj[wal) + wall — 7)] 
= Y AWA+ Y dA; WM. 


nEN(,d) iel n€N(-j,d) iel-j 
The first term is the sum over all n € NU, d + 1) with n; > 1, while 
the second term is the sum over all n € N(J, d + 1) with n; = 1. Hence 
the assertion is true for d + 1. C) 


d-1 
Lemma 6: (d ~ |I|)wa(I) = YY A¥*w,(). 


r=1 jel 


Proof: By Lemma 5, the right-hand side equals 


d~—1 
YON IL NP (22) 
r=1 jEInEN(L,r) ie] 
Each term in (22) is of the form 
II A" (23) 


iel 
for some m € N(I, d). Hence the assertion will be proved if it is shown 
that each of the terms (23) appears exactly (d — |J|) times in (22). 
Fix m. Then (23) appears in the sum (22) 
Y APT TL A 
nEN(Lr) iel 
if and only if m; > d — r, or r > d — m,. Therefore, the term (23) 
appears in (22) exactly )\jer (mj — 1) = d — || times, as required. O 
Next, for 1 < d <n define vectors {z,(J), J € Q} as follows: 


“a(I) = 2, ri (24) 

ieI 
zqa(I) = 0, 0s |I|<d (25) 
Za + j) = AjZa-r(l) + za(Z). (26) 


Lemma 7: Letd=|J| andI Cd. Then fore =1 


d 


War+ell) 7 2; (—1)'z,(J) Wa+e-r(I) = 0. 


r=1 


Proof: If I = @ the assertion is trivial by (19). We first prove the 
assertion for J = J ¥ ¢ using induction on d. If d= 1 and I = {i}, then 


Wi+e(Z) = Aiwell), by (20) 
= 2,([)w(I), — by (24). 
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Suppose the assertion is true for 1, ---, d. Let |J| = d andj € I. 
Then by (20), 


Wa+t+e—r(L + J) = jWate—r(T +) = AjWate—r(1). 


Multiplying both sides by z,(J) for r= 1 and summing for r= 0, --- , 
d + 1 gives 


Warite(l + J) — AyWare(T + J) 


d+1 


+ » (—1)'2,(J)[Wa+i+e-r0 + J) — AjWare-r(l + J)] 


d+1 


= Aware) + YL (-1)'Z-(J)ware-(D)] 


r=1 


d 
AjlWa+e(L) + x (—1)'Z,(J) Wa+e-r(I)], 


since Zg+i(J) = 0 by (25) 
= 0 by induction hypothesis. 


‘Hence, 
d+1 


0= Was+i+ell ng J) _ AjWase(L a J) ci » (—1)’[z,(J) 


r=2 
ay NjZr—1(J ) | Was1+e—-T + J) — Z1(J)Wase(l + J) 
= (-1) Za41(J)Aywe-ilI + j) 


d+1 
= Wasti+ell +7) + »y (-1)'Z(J + J) Wasite-- +7), 
r=1 


using (26) and the fact that Za+1(J) = 0 by (25). 


This completes the proof for the case when |J| = |J|. For the case 
|I| < || the proof proceeds as in the last step using induction on 
|| for fixed J. | O 
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The theory of vector quantization (VQ) of linear predictive coding (LPC) 
coefficients has established a wide variety of techniques for quantizing LPC 
spectral shape to minimize overall spectral distortion. Such vector quantizers 
have been widely used in the areas of speech coding and speech recognition. 
The conventional vector quantizer utilizes only spectral shape information 
and essentially disregards the energy or gain term associated with the optimal 
LPC fit to the signal being modeled. In this paper we present a method of 
incorporating LPC spectral shape and energy into the code-book entries of 
the vector quantizer. To do this, we postulate a distortion measure for 
comparing two LPC vectors that uses the weighted sum of an LPC shape 
distortion and a log energy distortion. Based on this combined distortion 
measure, we have designed and studied vector quantizers of several sizes for 
use in isolated word speech recognition experiments. We found that a fairly 
significant correlation exists between LPC shape and signal energy. Hence, 
an LPC shape combined with energy vector quantizer with a given distortion 
requires far fewer code-book entries than one in which LPC shape and energy 
are quantized separately. Based on isolated word recognition tests on both a 
10-digit and a 129-word airlines vocabulary, we found improvements in rec- 
ognition accuracy by using the VQ with both LPC shape and energy over that 
obtained using a VQ with LPC shape alone. 
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I. INTRODUCTION 


The idea of quantizing linear predictive coding (LPC) coefficient 
sets using a vector quantizer (VQ), rather than a scalar quantizer, has 
been studied for several years.'° Previously developed VQ algorithms 
have been widely used with good sucess in the areas of speech coding 
and isolated word recognition.” ® 

The “standard” VQ algorithm quantizes the spectral shape of the 
LPC vector to one of M* code-book entries, where M* represents the 
number of LPC prototype vectors needed to span the space of LPC 
vectors with a given distortion criterion. This type of vector quantizer 
disregards the gain or energy associated with the LPC vector; instead 
it codes only the spectral shape. For LPC vocoder applications the 
gain of the signal is generally coded independently of the LPC spectral 
shape. This effectively assumes that spectral shape and signal gain 
are independent of each other. For standard implementations of iso- 
lated word recognition systems, signal gain information generally has 
not been used.&?! 

Recently, Brown and Rabiner’ improved the performance of an 
LPC dynamic time-warping (DTW) word recognition system by in- 
corporating gain information into the conventional distortion measure. 
Their results indicated a substantial reduction in word error rates on 
a moderate-size vocabulary of words (129 words) used in an airlines 
reservation and information system. 

In this paper we extend the work of Brown and Rabiner and show 
how gain information can be incorporated into the vector quantizer 
design algorithm to yield a set of code-book entries with both spectral 
shape and gain information. We show that since a nonzero correlation 
exists between spectral shape and gain, the number of code-book 
entries required to obtain prescribed levels of distortion for both 
spectral shape and gain is significantly fewer than would be necessary 
to determine the same distortion levels using separate code books for 
spectral shape and gain. We use the newly designed code books in an 
isolated word speech recognition system based on the theory of hidden 
Markov models (HMM) and demonstrate that this system’s perform- 
ance surpasses one employing LPC shape code books of the same size. 

The organization of this paper is as follows. In Section II we briefly 
review the standard LPC VQ design algorithm. In Section III we 
demonstrate how we implemented the VQ design procedure using both 
LPC shape and energy. In Section IV, we describe results of several 
isolated word recognition tests using the combined LPC shape and 
energy VQ. In Section V we summarize the results. 


Il. REVIEW OF THE LPC SHAPE VQ DESIGN ALGORITHM 
Assume we are given a section of a speech signal, s(n) n= 0,1, ---, 
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N-1, with z-transform S(z). From this we derive the pth-order LPC 
model, S(z), of the form 


i G 
1G) 4 (1) 
1—-— ¥ ajz™ 
i=1 
where a’ = {1, a, G2, +--+ , ap} is the optimal pth-order LPC model and 


G is the model gain. It is readily shown that G can be written in the 


form 
G= \/ > e(n) = Va’Va, (2) 


where e(n) is the error between the true speech samples s(n) and the 
predicted speech samples s(n), (i.e., those obtained from the model) 
and V is the Toeplitz autocorrelation matrix of the actual speech 
signal, with the first row given by 


Vim) = XY s(n) s(n+m), m=0,1,---,p. (3) 


The zeroth autocorrelation coefficient, V(0), is conventionally called 
the signal energy. 

If we want to compare two LPC models, e.g., Ar(z) and Ar(z), of 
the forms 


Gr 


A; (2) =— = (4a) 
1— ¥ afz* 
k=1 
and 
G 
A(z) = ——- (4b) 
1— ¥ ake? 
k=l 


several related LPC distortion measures (distance metrics) have been 
proposed including: 
1. The Itakura-Saito measure of the form 





arVra Gi 
dis(Ar, Ar) = eee - : + In (2) : (5) 


2. The log likelihood measure of the form 


arVrar 
arVrar 


ditr(Ar, Ar) = In ( (6) 
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3. The gain normalized measure of the form 


arVrar e : 


dgn(Ar, Ar) = ( 4 (7) 
arVrar 


It is readily seen that dppp and dgy are essentially identical when 
values of d are close to zero,’ and differ primarily for large values of 
d. Both dy. and den are independent of signal energy, since the only 
term in the expressions of eqs. (6) and (7) that depends on signal 
energy is V7, which is cancelled out because it appears in both the 
numerator and the denominator. Although the djg measure of eq. (5) 
has a signal energy dependent term (G%/G%) and contains some energy 
information, experimentation by several researchers”®”’ indicates that 
dyutp and dey are much better for designing a VQ than djs. In our own 
work,” we have used den exclusively. 

Using the distortion measure of eq. (7), one can define a distortion 
(distance) between a training LPC vector (ar) and a VQ code-book 
vector (az). An algorithm for choosing a set of M* code-book vectors, 
Ar, that minimize the distortion of a set of training vectors from the 
code-book entries can then be devised, for example, 

I 
D,(M*) = min F min [dex(ar, in| | (8) 
{ap} Pf L=1 1<m=<M* 

where we have simplified the distance notation of eq. (7) to represent 
the distance between a test and a reference LPC vector (rather than 
a test and a reference LPC model). Various iterative algorithms for 
implementing the minimization of eq. (8) have been proposed and 
work well over a wide range of conditions.’ ® The optimum code books 
(which we will call spectral shape code books) are generated by a 
method similar to the K-means algorithm. Starting with an initial 
guess of M* entries, each LPC vector of the training set is assigned to 
the closest entry. The centroids of the M* subsets (clusters) obtained 
in this manner are used as new trial entries in the code book, and the 
iteration is continued until some stopping criterion is satisfied. Gen- 
erally, initial solutions for any desired value of M* are obtained by 
first finding solutions for smaller values of M* and then splitting some 
or all of the code-book entries. This way one can start from a value of 
M* = 1, where the solution is simply the centroid of the training set 
vectors, and generate code books for higher values of M*, by either 
splitting every cluster! or one cluster at a time.” 


Hl. MODIFICATIONS OF THE DISTORTION MEASURE TO INCLUDE 
SIGNAL ENERGY 


If we denote any of the LPC shape distortions of eqs. (6) and (7) as 
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dipc(T, R), then a straightforward way of including signal energy in 
the overall distortion is to form the sum 


d(T, R) = dipc(T, R) + of (de(T, R)), (9) 


where dz(T, R) is an energy distortion, f(x) is a nonlinearity applied 
to the energy distortion, and a is a multiplicative factor on the energy 
distortions. If we define the (unnormalized) test energy as Er, and the 
(unnormalized) reference energy as Ep, then 


Er = 10 logio(Vr(0)) (10a) 
Ep = 10 logio(Va(0)). (10b) 


A normalized energy (E, Ep) can be obtained making all energy 
values relative to a local peak energy (e.g., for isolated word recognition 
we make it relative to the peak energy within a word). Thus, 


Er = Er — (Er)max, (11a) 
Ep = Ep — (Er)max;, (11b) 

and an energy distortion, dz, can then be defined as 
d,(T, R) = |Er — El. (12) 


The nonlinearity, f(x), is used to give a smaller weight to small energy 
distortions. The form we have used is 


_ Jo, |x| < CLIP 
ars: 2 |x| > CLIP, ts) 


where CLIP is a threshold chosen by appropriate experimentation. 
The use of a clipping threshold on energy distortion was proposed 
previously by Silverman and Dixon" for use in speech spectra classi- 
fication studies. 

The combined distortion measure of eq. (9) has the property that, 
as a is made small, it approaches the LPC shape distance and, as a is 
made large, it becomes proportional to the energy distortion alone. 

Given the combined distortion measure of eq. (9), the parameters a 
and CLIP must be chosen in order to implement the computation. 
Based on results in Ref. 12, the optimum value of a is expected to be 
in the range 0.1 to 0.3. Similarly, based on previous experimenta- 
tion,'*”° a reasonable value of CLIP is in the range 0 to 6 dB. 


3.1 Application of the combined distortion measure to VQ 


It is straightforward to use the combined distortion measure of eq. 
(9) in the VQ design algorithm of Section II. The resulting VQ code- 
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book vectors are then characterized by an LPC vector, along with a 
normalized log energy value. To understand some of the properties of 
these code-book vectors, a simple set of experiments was carried out 
on a set of 10,000 frames of speech derived from spoken isolated words 
of a 129-word vocabulary of airline terms. The single words were 
spoken by 100 different talkers (50 male and 50 female) over a standard 
dialed-up telephone line. 

Figure 1 shows an energy histogram of the 10,000 frames of speech. 
The peak level of the normalized energy of any frame is, by definition, 
0 dB, and a dynamic range of about 60 dB for energy can be seen in 
this figure. The first experiment used the training set to design a 
conventional LPC shape VQ [i.e., a was set to 0 in eq. (9)]. A VQ with 
M* = 16 was designed and each of the 10,000 training vectors was 
assigned to one of the M* = 16 code-book entries. After convergence 
to the best set of code-book vectors, energy histograms of each of the 
16 subsets of training vectors were made, and the results are shown in 
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Fig. 1—Energy histogram of 10,000 frames of speech from isolated words. 
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Fig. 2. If signal energy and LPC shape were totally independent, we 
would expect each of the 16 energy histograms of Fig. 2 to be essentially 
identical. This is clearly not the case, as some of the energy histograms 
are peaked near 0 dB (i.e., strong vowels), other histograms are peaked 
near —50 dB (i.e., silence or weak fricatives), and still other histograms 
peak somewhere between these upper and lower limits. 

The energy histograms of Fig. 2 indicate a fairly high degree of 
correlation between LPC spectral shape and normalized signal energy. 
Hence, one would expect that using a VQ designed from the combined 
distortion measure would be more efficient than using a separate VQ 
for LPC shape and a separate quantizer for energy. 

A second set of experiments was run on the 10,000 vector training 
set in which the average LPC distortion, dppc, was determined as a 
function of M* (the VQ size) for the case of a = 0 (no energy in the 
distortion measure). Similarly, the average distortion, dz, was deter- 
mined as a function of M* for the case of a = © (no LPC in the 
distortion measure) and CLIP = 0. The results obtained are given in 
Table I. The last column of Table I gives the value of a* where 


a*dg = dpc, (14) 
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Fig. 2—Energy histograms of the 16 code words of an M* = 16 shape LPC vector 
quantizer. 
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that is, the value of a (as a function of M*) such that the average 
distortions due to energy and LPC would be equal. The results in 
Table I show that the average LPC distortion decreases slowly as M* 
increases, whereas the average energy distortion almost halves with 
each doubling of M*. The halving of the average distortion with each 
doubling of the size of the quantizer for a scalar variable is a well- 
known effect for scalar quantizers.’® The last column in Table I shows 
that a* increases dramatically as M* increases. Hence, for small values 
of M*, a values have to be very small or the VQ essentially becomes 
an energy quantizer. For larger values of M*, the value of a is not 
overly important, since the LPC distortion dominates. 

Based on the above discussion, LPC shape combined with energy 
VQs were designed for three sets of conditions, namely: 

1. a=0.1, CLIP = 0 

2. a= 0.3, CLIP = 0 

3. a = 0.8, CLIP = 6 (dB). 

The results (dipc, dz), as functions of M*, are shown in Table II. 
For the first set of conditions, the resulting VQ achieves compromise 
values of the shape and energy distortions. For example, when M* = 
64, the LPC shape average distortion is comparable to an M* = 16 
VQ, based on LPC shape alone (see Table I), and the energy average 
distortion is comparable to an M* = 10 VQ, based on energy alone. 
When a is raised to 0.3 (the second set of conditions), the energy 
distortion is lowered at the expense of increased LPC shape distortion 
for a given M*. Thus, for M* = 64 the LPC shape average distortion 
is now comparable to an M* = 7 LPC shape VQ, and the energy 
average distortion to an M* = 18 energy VQ. When a reasonable 
clipping threshold is used (CLIP = 6), the influence of the energy 
distortion term is significantly reduced, since all vectors within CLIP 
dB of each other (with similar LPC shapes) contribute a zero energy 
distance. Hence, for M* = 64, the LPC average distortion is compa- 


Table |—Average distortions as a 
function of M* for VQs designed 
with a = 0 (column 2), and a = ©, 
CLIP = 0 (column 3), and the a*, 
which gives equal distortion 


M* dipc dr a* 
2 0.784 5.88 0.13 
4 0.579 2.97 0.19 
8 0.428 1.47 0.29 
16 0.317 0.75 0.42 
32 0.218 0.37 0.59 
64 0.196 0.19 1.0 
128 0.149 0.09 1.65 
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Table II—Average distortions, as a function of M*, for 
several combined LPC shape plus energy VQs (10,000 
training vectors) 





a=0.1 a= 0.3 a= 0.3 
CLIP = 0 CLIP=0 . CLIP = 6 

M* dipc dz direc de dipc de 

2 0.93 8.11 1.28 5.96 1.28 4.40 

4 0.73 5.22 1.18 3.22 1.20 0.99 

8 0.62 3.29 0.79 2.37 0.78 0.14 

16 0.49 2.34 0.70 1.29 0.50 0.04 

32 0.38 1.80 0.55 0.97 0.36 0.01 

64 0.31 1.30 0.45 0.69 0.26 0.01 
128 0.26 0.98 0.36 0.49 0.21 0.008 


rable to an M* = 29 LPC shape VQ, and the energy average distortion 
is comparable to an M* > 128 energy VQ. 

The trend of the results of Table II reveals that for small values of 
M*, the combined VQ tries to reduce energy distortion at the expense 
of LPC distortion, while at larger values of M*, the VQ primarily 
reduces the LPC distortion. Since energy is correlated with LPC shape 
and vice versa, a reduction in one distortion will always bring about a 
reduction in the other distortion. Figure 3 illustrates a series of energy 
histograms of the condition 2 VQ (a = 0.3, CLIP = 0) for the training 
subsets of the M* = 16 case. The effects of using energy in the 
combined distortion measure are seen in that each histogram is tight 
around some average energy for the code word. Some energies are high 
(vowel-like sounds), some are mid-range, and some are low level (weak 
fricatives or silence). When CLIP is set to some nonzero value (e.g., 6 
dB), most of the clusters will have a very low energy distortion. Since 
the spread of the energy histograms is +6 dB or less from the average 
energy for the cluster in most cases, the energy distortion term is due 
to vectors outside the cluster. 


IV. APPLICATION OF THE COMBINED VQ TO WORD RECOGNITION 


To further evaluate the effectiveness of combining energy with LPC 
shape in the VQ, a series of isolated word recognition tests was carried 
out using the hidden Markov model (HMM) recognition algorithm 
described in Ref. 9. In this system an LPC analysis of each speech 
frame is carried out, and each LPC vector is vector quantized. For 
each word in the vocabulary, an HMM is designed using a training set 
of VQ outputs for the word. In normal usage each word HMM is 
scored by means of a Viterbi algorithm that computes the probability 
of the sequence of VQ outputs from the specified word HMM. The 
word model with the highest probability score is declared to be the 
spoken word. 
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Fig. 3—Energy histograms of the 16 code words of an M* = 16 LPC shape combined 
with energy vector quantizer designed with a = 0.3 and CLIP = 0 dB. 


The standard HMM word recognizer can be trivially modified to 
work with the combined LPC and energy VQ. The only change is in 
the quantization of the training set and of the unknown test. New 
HMM models are computed for the combined VQ and the standard 
scoring algorithm is still used in the recognizer. 

The HMM recognizer using the combined LPC plus energy VQ has 
been tested on two vocabularies, namely the set of 10 digits, and a 
129-word airlines vocabulary. The results of these tests are presented 
in the next sections. 


4.1 Recognition results on digits vocabulary 


For the 10-digit vocabulary, the training set consisted of each digit 
spoken once by each of 100 different talkers (50 male, 50 female). The 
test set consisted of a separate set of 100 tokens produced in the same 
way by the same 100 talkers. The test recordings were made about one 
month after the training recordings. All words were recorded over 
dialed-up telephone lines. 

Three sets of VQ parameters were used in the recognition system, 
namely: 
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1. a = 0 (no energy) 

2. a= 0.1, CLIP = 0 

3. a= 0.1, CLIP = 6 dB. 
For each set of VQ parameters a set of HMM parameters were 
computed for each word model for the following conditions: 


* = 32, 64, 128, and 256 
N = Number of states in Markov model = 5, 8, and 10. 


The results of the recognition tests (in terms of digit error rates) are 
given in Table III. The baseline LPC shape VQ (Table IIIa) has error 
rates of from 3 to 6 percent, depending on N and M*. Generally, the 
larger the value of M*, the lower the error rate for the recognizer. No 
strong dependence on N is seen in these results. The results of Tables 
IIIb and c show about a 1-percent reduction in error rate for the 
recognizers using the LPC shape combined with the energy VQ, for 
values of M* of 128 and 256. For the smaller values of M* (32 and 
64), there is no consistent improvement in accuracy with the combined 
VQ. This result is probably due to the high distortion of the LPC 
shape quantization for these small size VQs. 

An examination of the individual errors of the recognizer yielded no 
consistent pattern of errors that were corrected by the combined VQ. 
Since the total number of digit errors was so small (on the order of 20 
out of 1000 trials), it is difficult to identify how energy information is 
helping in this system. 


4.2 Recognition results on the 129-word airlines vocabulary 


The second recognition test of the combined VQ was conducted on 


Table III—Digit error rate scores for different 
values of N and M* 
M* 
N 32 64 128 256 
(a) Digit error rates in percent for a = 0 (no energy) 
5 6.0 3.9 3.8 3.8 
8 6.1 3.9 4.1 4.0 
10 4.8 3.1 3.0 3.0 
(b) Digit error rates in percent for a = 0.1, CLIP = 0 
5 5.0 4.3 2.7 2.3 
8 5.3 4.0 2.1 2.4 
10 5.6 2.4 2.0 2.4 
(c) Digit error rates in percent for a = 0.1, CLIP = 6 dB 
5 5.7 4.5 2.8 2.9 
8 41 3.0 oe 3.2 
10 5.4 4.8 2.2 2.2 
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a 129-word airlines system vocabulary. The training set again consisted 
of each word spoken once by each of 100 talkers (50 male, 50 female) 
over dialed-up telephone lines. The test set consisted of each word 
spoken once by each of 20 new talkers (i.e., not included in the training 
set), again over dialed-up telephone lines. 

Four recognition tests were performed under the following condi- 
tions: 

1. A standard dynamic time-warping (DTW) LPC-based recognizer 
without VQ. 

2. A DTW recognizer with an LPC-shape VQ using M* = 128. 

3. An HMM recognizer with an LPC-shape VQ using M* = 256, 
with N = 10 states in each Markov model. 

4, An HMM recognizer with the combined LPC-shape and energy 

VQ using M* = 128, with N = 10 states in each Markov model. 
For each test the average word error rate y of the recognizer was 
measured as a function of the best 6 candidates. An error rate of y for 
the best 6 candidates means the correct word was not in the 8 top 
recognition choices of the system y percent of the time. Results for 
values of 8 from one (conventional word error rate) to 6 are shown in 
Fig. 4. The results show that the DTW recognizer without a VQ 
performed the best. However, the HMM recognizer with a combined 
VQ of the size M* = 128 performed almost identically to the DTW 
recognizer with an LPC shape VQ of the size M* = 128, and signifi- 
cantly better than the HMM recognizer with an LPC shape VQ of the 
size M* = 256. The results strongly suggest that energy is a powerful 
discriminator for polysyllabic vocabularies, and when used in conjunc- 
tion with an LPC shape VQ of moderate size (equivalent M*, for the 
shape of about 32) performance is better than that achieved with even 
a significantly larger VQ based on LPC shape alone. 

Even more convincing evidence for the advantages of the combined 
VQ was obtained by implementing a connected word HMM recognizer 
for Levinson’s”” airlines vocabulary and syntax for an airlines reser- 
vation task. When the combined LPC shape and energy VQ was used 
on fluent-rate sentences by six talkers in a series of informal tests of 
the connected word HMM recognizer, sentence accuracies of about 60 
percent and word accuracies greater than 90 percent were achieved. 
Applying the same size VQ with LPC shape alone, sentence accuracy 
fell to about 5 percent, and word accuracy to about 15 percent! The 
energy contour constraints in the VQ appear to provide an elementary 
form of parsing of the sentence into syllables and words. Such time- 
synchronization markers are clearly necessary for ensuring any rea- 
sonable degree of accuracy in a connected word recognition task based 
on simple isolated word models. 
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Fig. 4—Average word error rate in percent as a function of the number of best 
candidates for the airlines vocabulary for each of four recognition systems. 


V. SUMMARY 


Our results with the combined LPC shape and energy VQ indicate 
that the addition of energy directly into the VQ design algorithm 
provides an efficient method of incorporating energy constraints into 
an isolated word recognition system. Tests with isolated word recog- 
nizers indicate improved performance using the combined VQ over 
that for the LPC shape alone, provided that a sufficiently large VQ 
size, M*, is used (i.e., M* > 128). 

The incorporation of energy into the VQ design algorithm is 
straightforward. All that is needed is a modification of the distortion 
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measure to include both LPC distortion and energy distortion. The 
form we have chosen for the combined distortion measure is based on 
our knowledge of distance metrics and our experience with energy 
contours for recognition purposes. Alternative definitions of combined 
distortion measures may well provide further improvements in per- 
formance. 

Our results on the combined VQ algorithm indicate an improved 
efficiency of quantization by combining LPC shape and energy into a 
single VQ, over that obtained for separate LPC shape and energy 
quantizers. Our results could also be applied to LPC voice coding with 
reductions in the bit rate required to achieve desirable levels of 
quantization of the LPC vector and gain terms. 
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This paper describes recent developments in modeling and analysis tools 
that are basic to a generic approach to the design of network services circuits 
for voiceband applications. Providing network services is a major part of the 
effort to support the tactical needs of telecommunications users. The design 
of network services circuits is the process of choosing configurations of 
transmission facilities and equipment for realizing circuits within appropriate 
performance specifications. Techniques for design have been developed under 
the general heading of “standard designs” and, until recently, have largely 
taken their structure from the capabilities of specific equipment. 


I. INTRODUCTION 


Providing public switched services constitutes a major part of the 
telecommunications industry. These services, collectively referred to 
as plain old telephone service (POTS), are provided over the public 
switched telecommunications network to residential and business 
customers. Another segment of this industry involves the offering of 
special services. A customer service is termed a special service if it 
requires special procedures and/or equipment for design, installation, 
or maintenance as compared to POTS. Special services provide ar- 
rangements and features that are not ordinarily available in the public 
network. They are generally provided to business customers by a 
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variety of facilities such as dedicated private communication lines, 
switched-service arrangements and adjuncts to POTS. Figure 1 illus- 
trates some typical special services circuits. 

We refer to POTS and special services collectively as network 
services. The design of network services circuits is one of the major 
processes in the provisioning of these services. It involves the selection 
of proper transmission facilities as well as the selection and placement 
of necessary network equipment so that the resulting circuit meets all 
transmission and signaling requirements. In the case of special services 
circuits, the design is more complicated because, generally, such cir- 
cuits have more stringent transmission and signaling requirements 
than POTS circuits. However, they must be built from communication 
facilities often intended for POTS. 

Most network services are extended to customers’ premises over a 
network of transmission facilities referred to as the loop plant. This 
network is constructed to serve large local areas. Thus, especially in 
analog voice transmission, a large variation exists in the transmission 
characteristics of the loops.’ This further complicates the design of 
network services circuits, particularly special services circuits where 


PUBLIC SWITCHED 
NETWORK 





EO — END OFFICE (LOCAL CENTRAL OFFICE) 
FX — FOREIGN EXCHANGE 
OPS — OFF-PREMISES STATION 
PBX — PRIVATE BRANCH EXCHANGE 
PBX-CO — PBX-CENTRAL OFFICE 
WATS — WIDE AREA TELECOMMUNICATIONS SERVICE 


Fig. 1—Typical special services circuits. 
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the transmission characteristics of each link in the circuit must be 
tightly controlled so that the transmission requirements (such as loss, 
attenuation distortion, echo, and noise) in the overall circuit are kept 
within acceptable bounds.” 

The general steps in the design of a network services circuit are as 
follows: 

1. Selection of loop facilities (if any) 

2. Selection of interoffice transmission facilities (i.e., routing) 

3. Selection of design architecture 

4, Selection of network equipment 

5. Setting network equipment (i.e., adjusting equipment parameters 
to proper values) 

6. Validation of the circuit (i.e., making sure that the circuit meets 
all transmission and signaling requirements). 

Standard methods have been developed for the design of network 
services circuits.*° These methods, referred to as standard design 
procedures, are sets of guidelines and algorithms that enable the 
designer to perform the above steps in the design systematically. To 
date, however, these procedures have been heavily dependent on the 
capabilities of specific network equipment rather than on their func- 
tions. That is, the procedures assume that only certain specific equip- 
ment items, whose characterizations have been stored, can be used in 
the design. The drawback of this is that whenever a new piece of 
equipment is developed, design procedures must be revised in order to 
incorporate it into the circuit design process. This often causes a 
considerable time lag between the availability of a new piece of 
equipment and its incorporation into the design. 

To alleviate these shortcomings, a generic design concept must be 
developed. Such a design concept would be based on the functional 
features of equipment items rather than their specific technology. We 
discuss the generic design concept in Section II, where we also identify 
the theoretical basis required for the realization of the generic design 
process. The remainder of the paper is concerned with developing this 
theoretical basis. Equipment models are discussed in Section III. An 
analysis tool which can be used to standardize various equipment 
requirements in network services circuits is presented in Section IV. 
An illustrative example is taken up in Section III and is followed 
through in Section IV. A glossary of terms used in the text with their 
definitions is given in Appendix A. The emphasis of the discussion is 
on the transmission aspects of voice-grade network services circuits. 


Il. THE GENERIC DESIGN CONCEPT 


Generic design methods are based on the functional features of 
network equipment. Thus they remove the dependence of design on 
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specific equipment technology. Steps in the generic design process 
include identification of appropriate facility/equipment configura- 
tions, development of standards for equipment functions, and deter- 
mination of design objectives. The required subprocesses in the generic 
design process are listed below: 

1. Estimation of equipment/facility performance capabilities 

2. Definition and priority ordering of various design options (or 
strategies) 

3. Determination of equipment/facility compatibility rules 

4. Identification of various transmission/signaling options 

5. Development of methods for performance validation (i.e., simu- 
lation) 

6. Packaging and deployment of methods (e.g., software develop- 
ment). 

Figure 2 shows the process flow in the provisioning of network 
services circuits. The top part of this figure (Databases) shows various 
databases that contain information about transmission facilities and 
equipment as well as design architectures and design objectives. The 
boxes in the middle row of this figure (Design Steps) show the tasks 
that have to be performed by the circuit designer. They start with the 
Service Order and culminate in the Circuit Work Order (used by the 
installers). Cumulative outputs of the design processes are shown 
above these boxes. The bottom row of Fig. 2 (Design Algorithms) 
shows various algorithms needed to perform the design. The designer 
uses these algorithms in conjunction with the databases to perform 
the design steps. For example, the designer accesses the loop-inventory 
database and uses the loop-selection algorithm to implement the loop- 
selection step of the design, i.e., to select a loop (or loops) for the 
circuit being designed. 

The generic design process has a hierarchical information structure. 
The top level of such a hierarchy contains robust constructs for 
network services circuits. In such constructs one deals with the fea- 
tures required of various equipment rather than their identities. Each 
equipment is represented by a feature code which is a collection of 
feature elements (such as gain and equalization) that provide the 
appropriate transmission and/or signaling treatment for the equip- 
ment’s intended use in a circuit. To fully develop these constructs, 
models are needed for the equipment that can provide adequate and 
representative characterization of equipment behavior. In such models, 
referred to as generic equipment models, the primary characterization 
of equipment would be functional rather than physical. Specific equip- 
ment could be coupled into the generic design process by linking the 
equipment to feature codes through a validation step. 

The lower levels of the information hierarchy further specify the 
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Fig. 2—Process flow in the provisioning of network services circuits. 


requirements of the equipment in the circuit based on circuit config- 
uration and service requirements. Thus the information hierarchy will 
be stable at the top level and only the lower levels may change due to 
equipment technology and type of service. 

Another requirement of the generic design process is an analysis 
tool for evaluating various design strategies, determining equipment 
functional specifications, and predicting circuit performance. Such an 
analysis tool should be capable of simulating circuit performance based 
on nominal facility and eqiupment specifications. It should be able to 
evaluate the effects of variations in the transmission facility and 
equipment specifications due to manufacturing tolerances, quantiza- 
tion errors, temperature changes, aging, etc. The analysis tool would 
be used to determine correct bounds on the elements of equipment 
features in various design configurations. This, in turn, would allow 
the equipment to be categorized (or standardized) according to the 
bounds on the elements of their features. 

The benefits of the generic design process can be summarized as 
follows: 

1. It alleviates shortcomings in the application of the present design 
methods which are limited to specific network equipment. That is, it 
facilitates the incorporation of suitable new equipment into the circuit 
design process. 

2. It promotes the standardization of equipment functions. This, in 
turn, can serve as guidelines for equipment developers. 

3. It reduces the lead time required to incorporate new equipment 
into the circuit design process. Indeed, it enables parallel (rather than 
serial) interaction with the equipment developers. 

4. It facilitates coordination of planning and provisioning. 

We discuss equipment models in Section III. An analysis tool for 
the generic design process is presented in Section IV. Various infor- 
mation management tools would be needed to develop and deploy the 
generic design process. In this connection, some software tools have 
already been developed.® However, we are not concerned with this 
aspect of the generic design process here. 


ill. MODELS 


A generic segment of a network services circuit is a portion of the 
circuit that can be functionally characterized apart from any specific 
equipment technology. For example, consider the network services 
circuit with the three 2-wire cable facilities and two repeaters shown 
in Fig. 3. Any facility alone, any repeater alone, or any repeater with 
one or two adjacent facilities can be considered as a generic segment. 
The models to be described in this section are for 2-wire to 2-wire, 2- 
wire to 4-wire (or 4-wire to 2-wire) and 4-wire to 4-wire generic 
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Fig. 3—A 2-wire voice-grade circuit. 


segments. A generic equipment model is an equipment model in which 
primary characterization of the equipment is functional (e.g., trans- 
ducer and return losses) rather than physical (e.g., resistors, inductors, 
and capacitors). 


3.1 Equipment and facility considerations 


As Fig. 3 indicates, the functional features of voice-grade network 
equipment are gain adjustment, equalization, and impedance control. 
The facility impairments to be considered in generic modeling are loss, 
loss (or attenuation) distortion, and irregular impedance. In developing 
generic equipment models, we consider the following generic segment 
measures: 1-kHz gain (or loss); gain-frequency response; 1-kHz phase 
delay; envelope delay distortion (EDD); and impedance control (return 
loss). Table I summarizes the generic segment measures.’ Column 1 
lists the generic segment measures; column 2 lists the parameter from 
which each measure is derived; column 3 lists the definition and the 
dimension of each measure; and column 4 lists the typical form of 
specification of each generic segment measure. For example, the 
1-kHz gain is defined as 10 log[g,(1 kHz)], where g,, is the forward 
voltage gain of the equipment. This measure is typically specified by 
a nominal value with symmetric tolerances as 3.5 + 1.0 dB. The 
balance return loss and the structural return loss in Table I are 
typically specified by minimum and median weighted values.® 


3.2 Parameterization 


To represent a linear network, one can use any one of the following 
sets of parameters: the open-circuit impedance (or the z) parameters; 
the short-circuit admittance (or the y) parameters, the transmission 
(or the ABCD) parameters; the scattering (or the s) parameters, the 
hybrid (or the h) parameters. In this study we will use unnormalized 
voltage scattering parameters” to characterize a generic segment for 
the following two reasons. First, not all networks possess the z or the 
y parameters; however, any linear network can be represented by a 
scattering matrix. Second, most of the commonly used network func- 
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Table I—Generic segment measures, their derivatives, definitions, 
and typical specifications 


Generic Segment Parameter Derived Definition and Typical Form of Speci- 
Measure From Dimension fication 
1-kHz gain Forward gain g,, 10 log[g.(1 kHz)] Nominal value with 
Reverse gain g,, Decibels symmetric toler- 
(see note 1) ances 
Gain-frequency Forward gain g,, 10 log[g.(f)] Template over band 
response Reverse gain g., Decibels 400-2800 Hz and 
(see note 1) referenced to 1 kHz 
Phase delay Forward gain gy T,(1 kHz) Maximum value at 1 
Reverse gain 2,, Microseconds kHz 


(see note 2) 


Tf) oa Tel fret) 
Microseconds 
(see note 3) 


Maximum difference 
of envelope delay be- 
tween any two fre- 
quencies over 500- 
3000 Hz band 


Forward gain g., 
Reverse gain g,, 


Envelope delay 
distortion 


Balance return Input: —20 log[} 6:|] Minimum weighted 
loss Balance reflection Decibels values; e.g., echo and 
coefficient 6, singing (high and 
Output: low) return loss 


Balance reflection 
coefficient 52 


Structural return Input: —20 log[] pu] ] Minimum weighted 
loss Structural reflection Decibels values 
coefficient p11 
Output: 


Structural reflection 
coefficient p22 


1. g, (equipment voltage gain) can be g,y or g,, 
1 
2. T,(f) =—- a arg(sai) 


3. T.(f) = <. [—w-arg(so1)] 


tions, e.g. the input and the output impedances, the transducer loss, 
the return loss, and so on, can be easily related to the scattering 
parameters (i.e., the elements of the scattering matrix). 

As an example consider a 2-wire generic segment, a 2-port network, 
which is represented by a 2 X 2 scattering matrix {s;}, i, j = 1, 2. It 
can be easily shown that the input and output impedances are given 
by 


1+ 
Zin = ae Zor (1) 
— $11 
1+ S22 
Len = = Zi 3 
ee Pe 02 (2) 


where Zo; (i = 1, 2) is the impedance seen looking from the repeater 
into facility i with a resistive termination (nominally 600 or 900 ohms). 
Also the forward transducer gain is given by’ 
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= Re(Zo1) Re(Zoz) 


a? ea ° 
and the reverse transducer gain is given by 
tg, = Re(Zo)Re(Zoe) I si]. (4) 
|Zo1| 
The return loss RL; (i = 1, 2) at port i can be shown to be’ 
RL; = —20 log |sii|. (5) 


One can see that the s parameters (i.e., $11, S12, $21, aNd Sep) of a 2-wire 
generic segment can be obtained from these network functions. It 
should be mentioned that an input reference impedance and an output 
reference impedance are required in order to represent a linear 2-port 
network by a scattering matrix.? In this study, Zo, and Zo2 represent 
the input and the output reference impedances (see Fig. 4). 


3.3 Reference environments 


There are two reference environments that are commonly used to 
specify transmission performance measures: (1) the driving-point (or 
impedance control) environment, and (2) the gain-loss related envi- 
ronment. The development of generic equipment models is based on 
these two environments. However, the final model is characterized in 
the gain-loss related environment. 

The impedance control measures, such as the structural return loss,® 
balance return loss, and equal-level echo path loss (ELEPL),’ are 
specified in the driving-point environment. The 1-kHz loss, the gain- 
frequency response, the phase delay and the envelope delay distortion 
(Table I) are specified in the gain-loss related environment. 

The reference impedances in the gain-loss related environment are 
called the primary input and output reference impedance (Fig. 4). 
Those in the driving-point environment are called the secondary 
reference impedances. 


3.4 Gain definition 


The following approximation has been widely used by network 
services circuit design engineers: the end-to-end circuit transducer loss 
is equal to the sum of the facility transducer losses minus repeater gains. 
For a 2-wire network services circuit with only one repeater, the 
statement becomes valid if the following definition for forward repea- 
ter voltage gain is adopted for a repeater:” 


S21 JR./R> 
[p= Ba ss 6 

Sof (1 ne a)(1 = a2) i/ 2 ( ) 
where 
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BASIC PARAMETERS: 


Zoi/Zo2 = primary input/output references impedances 


Z:/Z2 = secondary input/output reference impedances 
Woi/Wo2 = nominal input/output impedances 
R,/Rz = nominal input/output resistances 
Pu/p22 = input/output structural reflection coefficients 
6;/62 = input/output balance reflection coefficients 


8/8 = forward/reverse voltage gain 


DEFINED AND DERIVED PARAMETERS: 
Wii = Loi. Zi im Loi R; = Loi 
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Fig. 4—A 2-wire generic segment and its associated parameters. 


Le ze 
R; + Zoi 


Zo1(Zo2) is the primary input (output) reference impedance, and R; is 
the nominal input (i = 1) or the nominal output (i = 2) resistance.’ R, 
and R, are often associated with the generic segment in the gain-loss 
environment. They are used for purposes of gain administration. They 
do not represent the true input and output resistances. For a circuit 
with two repeaters (Fig. 3), even with the gain definition given in eq. 
(6) the statement given earlier in this section is not precise any more. 
Section 3.7 will discuss this in detail. 

A generic equipment model, for a 2-wire generic segment, reflecting 
common performance measures, has been developed based on the gain 
definition given in eq. (6). This model is represented by the following 
2 X 2 scattering matrix: 


A d= vr 1 
ae Y10811 = VT+ x Bu-(1 + az) 


Qa; 


i = 1, 2, (7) 


2710 
-(1 — a;) VRy/Re > , 
{si} = Bie lt 1 — Y2S22 
| Bo + ax)(1 — a2) VER, OF > (8a) 
-{[l1-— v1 + x] 
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Fig. 5—A 4-wire to 2-wire (or 2-wire to 4-wire) generic segment. 


where $i, yw, i = 1, 2, g.4, and g,, are defined in Fig. 4 and 


Ay wy2H812821 (8b) 


x= = care 
= 10811) (1 a 20822) 


Detailed information on this 2-wire model can be found in Ref. 7. 


3.5. Two-wire to four-wire and four-wire to four-wire segments 


In this section we first describe the model of a 4-wire to 2-wire (or 
2-wire to 4-wire) generic segment (Fig. 5). The “receive” path is 
designated as port 1, the “transmit” path as port 2, and the 2-wire 
port as port 3. The segment can be represented by a 3 X 3 scattering 
matrix. 

In most applications of interest, we can assume that both the 
transmit and the receive paths are unidirectional. This assumption 
results in three zero elements: sj2 = s;3 = 0. Thus, the resulting 
scattering matrix becomes 

$11 0 0 
{sj} = | Sor S22 Sp |. (9) 
S31 O Sag 


Next, we will relate the matrix elements with the transmission per- 
formance measures. The diagonal elements are related to the return 
losses as follows: 


Si = px (t= 1, 2) (10) 
and 
n3 + p33 
$4 = 11 
1 + napss ve 
where 
return loss at port 1 = —20 log| pi|, (12) 
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= Wos = Zo 
Wos + Zos’ 
and Zo; is the primary reference impedance at port 3 while Wo3 is the 


nominal input impedance at port 3. 
The transmit path voltage gain, 223, is related to so3 as follows: 


Seg = go3(1 + a3)VRi/Rs, (14) 


where R; is the nominal resistance at port 1 or 2, R3 is the nominal 
resistance at port 3, 


13 (13) 


_ Wes — Zen 
Woz + Zos’ 


and 23 is the voltage gain from port 3 to port 2. The receive path 
voltage gain, g32, is related to s3; as follows: 


S31 = 830(1 — a3)VR3/R1. (16) 


The element sz; can be obtained as follows. When port 3 is terminated 
by a balance impedance Z3, the Equal-Level Echo Path Loss from 
port 1 to port 2 is given by 


ELEPL12 = —20 log|sais| — (T — R), (17) 


where —20 log| soz] represents the voltage gain, in dB, from port 1 to 
port 2, T is the transmitted signal level at 1 kHz at port 1, and R is 
the received signal level at 1 kHz at port 2. We define 


(15) 


a3 


_ 23 — £3 
ae Tite, (18) 

a Z3 — Zos 
3= Zs + Los es (19) 
Zs» — Zos (20) 


where Zo3 is the primary reference impedance and Zs; is the secondary 
reference impedance at port 3. It can be shown that 


¥3 + 63 


———-. 21 
1 + 363 a) 


Y3b = 


The reference impedances for the final model are {Zo1, Zo2, Zo3}, while 
the reference impedances for S213 are {Zo1, Zo2, Z3o}. The conversion 
technique for converting the scattering parameters under a change of 
reference impedances must be used.® The conversion factors are 


T,=Il,=0 and I3= yz, (22) 
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and it can be shown that 


Y 36823831 
$21B = Sa. + ——_ 
1 = Y3b833 
or 
Y 30823831 
So1 = S21B — (23) 
1 — 35533 


Thus, a 2-wire to 4-wire (or 4-wire to 2-wire) segment is represented 
by a 3 X 8 matrix, eq. (9), with its elements described in (10), (11), 
(14), (16), and (23). 

A 4-wire to 4-wire segment (Fig. 6), in reality, can be considered as 
two unidirectional 2-port networks. A 2 X 2 matrix can be used to 
represent either the transmit or the receive path: 


{sy} = ie : } (24) 


S21 S22 
where s;; (i = 1, 2) is a function of return loss at port 1, 
S21 = 8 VRi/Re, (25) 
and g2; is the gain from port 1 to port 2.. 


3.6 Example 


We will use the network services circuit shown in Fig. 3 as an 
example for the following studies: 

1. Circuit (end-to-end) transducer loss of a 2-repeater circuit 

2. Applications of generic equipment models to 2-wire network 
services circuit design 

3. Sensitivity of transmission performance measures to return loss 
deviations. 

It can be shown that for a 2-repeater/3-facility 2-wire network 
services circuit the transducer loss, L (in dB), is given as’ 


e Ro Roe 
b= Y- 3G +H, + 10 log —— (26) 
i=1 RyRy’ 
a SS SSS SS = 
ef = 
1 Z 
| | 
| | 
|} 
2 1 
| | 
Wea ans aah ce ae eee es 4 


Fig. 6—A 4-wire to 4-wire generic segment. 
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where L; is the transducer loss, in dB, of facility i; G; is the voltage 
gain, in dB, of repeater 7. Also 


L. = 20 log|te| (27a) 
te = 1 — fo21P11282128122, (27b) 


where p22: is the output reflection coefficient of repeater 1, pi12 is the 
input reflection coefficient of repeater 2, so:2 is the sq; of repeater 2, 
Sig2 is the S12 of repeater 2, and R,; (R2;) is the nominal input (output) 
resistance of repeater i. 

In practical applications we have Ri, = R2; and Riz = Re. Therefore 
eq. (26) becomes 


(avs Cag. (28) 


Thus, for a 2-repeater/3-facility 2-wire network services circuit, the 
circuit transducer loss is equal to the sum of facility transducer losses 
minus the sum of repeater gains plus an error term, L,. In most 
applications, L, has a value between 0 and 0.3 dB. 

The models developed in this section have been applied to several 
cases. The 2-wire network services circuit shown in Fig. 3 has been 
analyzed. A chi-square random number generator was used to generate 
return losses meeting specified minimum and median requirements. 
(The rationalization for using chi-square distribution can be found in 
Ref. 10.) A Monte Carlo method was applied for the circuit simula- 
tion.!! The Universal Cable Circuit Analysis Program (UNICCAP) 
was used to analyze the specific repeaters, used in the circuit.’ The 
results were in complete agreement with the results obtained from 
generic equipment models. 

The transducer loss or the attenuation distortion requirements of a 
network services circuit are specified by upper and lower templates 
for gain-frequency response of the circuit. Let L(/) be the transducer 
loss at any frequency f. Then the attenuation distortion D(/f) is defined 
as 


D(f) = L(f) — LQ kHz). (29) 


Figure 7a shows the upper template, D,,, the lower template, D,, and 
the median curve, D,, for a typical attenuation distortion objective. 
For analytic purposes, these templates are smoothed as shown in Fig. 
7b. Note that with a proper scale for the vertical axis in Fig. 7b, the 
templates specify the transducer loss instead of the attenuation dis- 
tortion requirement of the circuit. 

The results for the circuit shown in Fig. 3 are given in Fig. 8. It is 
clear from Fig. 8 that this 2-repeater/3-facility network services circuit 
provides a satisfactory performance. 
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Fig. 7—Template and median curves for attenuation distortion objective: (a) Atten- 
uation distortion objective; (b) smoothed attenuation distortion objective. 


A sensitivity study of transmission performance was also made for 
this example. It was found that deviating all the return losses 6 dB 
from their nominal values would result in a circuit that would no 
longer provide satisfactory performance. Figure 9 shows the gain- 
frequency response in this case. This confirms with practical experi- 
ence that return loss (or impedance mismatch) is a critical transmis- 
sion performance measure. 

The results obtained from this study indicate that the generic 
equipment models developed here are valid and useful. It is easy to 
apply them to assess a candidate design process in the generic design 
of network services circuits. 
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Fig. 8—Gain-frequency response with typical return-loss performance. 


IV. THE ANALYSIS TOOL 


Network services circuits are designed based on the nominal infor- 
mation about the transmission facilities and equipment. However, the 
parameters of the transmission facilities and equipment are subject to 
inevitable variations around their nominal values. These variations 
may be due to temperature changes, quantization, aging, etc. Statistical 
characterizations of these variations are often available from previous 
data. The design objective must be satisfied despite these variations. 

The problems encountered here involve the determination of con- 
straints on some parameters of the circuit while constraints on the 
remaining parameters of the circuit are specified. More specifically, 
let us assume that the design configuration, the design strategy, and 
the design objective are known. The associated problems can then be 
divided into the following three types: 

1. Given the specifications of the transmission facilities, determine 
the type of equipment to be used. (Note that if generic equipment 
models are available for each design configuration, then the type of 
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Fig. 9—Gain-frequency response with degraded return-loss performance. 


equipment will refer to the constraints on the parameters of the 
models.) 

2. Given the specifications of the equipment, determine the maxi- 
mum application range of the facility sections. 

3. Given partial specifications of the transmission facilities and 
equipment, determine the constraints on the remaining equipment 
parameters and the maximum range of the remaining facility sections. 

Regardless of the type of problem, the analysis tool should be capable 
of determining the constraints on the unknown parameters of the 
circuit based on the available information about the specified param- 
eters, such that the design objective is satisfied. In the following 
subsections we first discuss the nature of the objective functions in 
such problems. We then describe the analysis tool used to determine 
the optimal constraints. Finally, we illustrate the results by an exam- 
ple. 


4.1 Nature of objective functions 


Network services circuits must be designed such that circuit require- 
ments on each overall performance measure are satisfied. Due to 
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variations in the transmission facility and equipment parameters, 
circuit requirements cannot be fixed at exact deterministic values. 
(Such exact values are referred to as target or design-center values.) 
The circuit requirements are, of necessity, statistical in nature; that 
is, variations around design target values are specified statistically. 

An example of a circuit requirement is that related to circuit 
transducer loss at 1 kHz, L(1000), as described blow. If the target 
circuit transducer loss at 1 kHz is L(1000), then it may be specified 
that in PC; percent of circuit samples, the transducer loss is not to 
deviate by more than A; dB from the design target value (i = 1, 
2, -++); that is, | (1000) — £(1000)| < A;. We define 


NP, = |L(1000) — L£(1000) | (30) 


as a normalized circuit parameter and specify statistical bounds on 
NP,. The collection of such statistical circuit requirements on the 
normalized circuit parameters is collectively referred to as the design 
objective. Figure 10 shows an element of the design objective for a 
typical special services circuit, i.e., a foreign exchange (FX) line. In 
this figure, which refers to circuit transducer loss, we have NP, = 
| L(1000) — 3.5] and 


NP, < 0.5 dB, PC, = 95; NP, <= 1.0 dB, PC, = 5, (31) 


Note that the design target value for NP, is zero. The cumulative 
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Fig. 10—Representative transducer-loss design objective for an FX line. 
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distribution of NP, for a large number of circuits designed by the 
application of a certain design strategy is referred to as the design 
performance. A typical design performance curve is also shown in Fig. 
10. Note that any design performance curve that falls to the left of 
the “edge of the design objective” in Fig. 10 is acceptable. 

Another example of a circuit requirement is that related to atten- 
uation distortion. We define the normalized circuit parameter in this 
case as 


NP2= min Yy(f), (32) 
400< f<2800 
where y( f) is the latent loss distortion of the circuit, defined as follows: 
D(f) — Dz 
D, —D- for D-s D(f) <= Dn 


V(f) = B= pe for Dns D(f) = Dy (33) 


D(f) — D-, for D(f) < D- 
D, — D(f) for D(f) > Dy. 


In the above equation, D(f) is the difference between circuit transducer 
loss at frequency f and at 1000 Hz given in eq. (29), D-and D, are the 
lower and upper bounds on the loss-versus-frequency characteristic of 
the circuit, and D,, is the median curve between D,and D,. Figure 11 
shows the upper and lower templates for gain-frequency response of a 
foreign exchange (FX) line where, for convenience, the curves D-and 
D, are made piecewise linear. Figure 12 shows the statistical bounds 
on NP». In this figure we have 


D(f) = L(f)}-L(1000) 
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Fig. 11—Piecewise linear template and median curve for attentuation distortion 
design objective of an FX line. 
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Fig. 12—Representative attenuation distortion design objective for an FX line. 


NP, > —0.5, PC, = 5; NP, = 0, PC, = 95. (84) 


Note that the target value for NP» (as well as the maximum value it 
can assume) is 1. 

Equation (34) indicates that a 5-percent deviation from the design 
requirement related to attenuation distortion will be tolerated. An- 
other way of interpreting this is that the design process will produce 
circuits that meet requirements in 95 percent of the time (assuming 
ergodicity). 


4.2 Methods of optimization 


The optimization problem under consideration can be stated as 
follows. Given the nominal values of and the statistical information 
about some parameters of the circuit, determine the constraints on 
the remaining parameters of the circuit such that the design objective 
is satisfied. 

We use a Monte Carlo procedure”! to simulate the circuit, i.e., to 
include the deviations from nominal parameter values, and to evaluate 
the design performance. To illustrate the procedure, consider a circuit 
including a set of unspecified parameters p;, 1 = 1, 2, ---, n (as well 
as a set of specified parameters) that are subject to variations. The 
circuit is assumed to meet the design target when parameters are at 
their nominal values. The value of each parameter can vary according 
to a certain statistical distribution within a range containing its 
nominal value. Parameter values may have upper bounds, lower 
bounds, or both upper and lower bounds. The problem is to determine 
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bounds on the values of parameters p; which result from meeting the 
design objective. The basic method to be used consists of the following 
three steps: 

1. Determine the nominal parameter values p;, i= 1, 2, ---,n. 

2. Choose suitable initial bounds for parameter values around their 
nominal values. 

3. Analyze and evaluate the circuits resulting from the random 
choice of parameter values within their corresponding (initialized or 
specified) bounds. 

4. Determine how these bounds should be modified to improve the 
design performance. 

Steps 3 and 4 are repeated until the design objective is met. These 
steps involve statistical analysis of the results and their interpretation. 

After determining the nominal values for the unspecified circuit 
parameters, one of the following three basic approaches can be em- 
ployed: 

1. Choose arbitrary initial bounds on parameter values and use 
iterations to tighten or loosen the bounds. 

2. Choose sufficiently large initial bounds on parameter values and 
use iterations to tighten the bounds. 

3. Choose sufficiently small initial bounds on parameter values and 
use iterations to loosen the bounds. 

We use a combination of the first and second approaches. More 
specifically, we use the first approach to determine suitable initial 
bounds on parameter values. Note that the nominal values of param- 
eters must fall within these initial bounds. The initial bounds on 
parameter values must be sufficiently loose so that no unnecessary 
constraint is placed on any parameter. At the same time, they should 
not be so loose as to cause the procedure not to converge in a reasonable 
number of iterations. We then gradually tighten these bounds, through 
iterations, until the design objective is met. 

The flowchart of the above procedure is shown in Fig. 13. In this 
figure, the INITIALIZER is a software module which provides initial 
bounds on parameters p; based on the properties of these parameters 
and the circuit. The Monte Carlo circuit analysis program in each 
pass uses samples of parameters p;, generated by proper random 
number generators, and analyzes the resulting circuit to determine 
whether or not it meets the requirements. The CIRCUIT ANALYSIS 
MODULE in the flowchart of Fig. 13 is a software module which, for 
given design configuration, facility specifications, and equipment pa- 
rameters, analyzes the circuit (e.g., by using the methods of Section 
III) and determines the normalized circuit parameters. The results of 
each pass through the algorithm are stored. The INTERPRETER in 
Fig. 13 is a software module that uses the stored results to determine 
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Fig. 13—Statistical analysis procedure. 


the parameter whose bounds must be modified, as well as the new 
bounds for it such that a larger proportion of circuits will meet the 
requirements in the next Monte Carlo run. The algorithms used by 
the INTERPRETER are explained in Appendix B. The Monte Carlo 
runs are repeated until a set of bounds are produced for parameter 
values that meet the design objective. 
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4.3 The equipment requirements optimization process 


One of the objectives of the generic design process is to facilitate 
the incorporation of general network equipment products into the 
circuit design process. To accomplish this, one has to determine the 
admissible ranges of values for equipment parameters used in various 
design configurations. In this section we will explain a computer-aided 
approach developed to determine the widest possible ranges for the 
values of equipment parameters in a given design configuration. More 
specifically, we assume that the design configuration and the facility 
specifications are known. We want to find maximum tolerances for 
the equipment parameters based on the given design strategy and 
design objective. Let us indicate the equipment parameters by pj; i = 
1,2,---,nj;J=1,2, ---,N, where n; denotes the number of parameters 
related to equipment E;, and N denotes the total number of equipment 
items in the circuit. The basic flowchart for the equipment require- 
ments optimization process is shown in Fig. 14. The process consists 
of four steps: 

1. Determine the nominal circuit solution. 

2. Check the design strategy. 

3. Initialize the process. 

4, Analyze and interpret. 

These procedures will be explained below. 


4.3.1 Determine the nominal circuit solution 


In this procedure the facility specifications are fixed at their nominal 
values. The design strategy is then applied to determine the nominal 
values of the equipment parameters, i.e., pj, 1 = 1, 2, ---, nj; 7 = 1, 
2, ---, N. This corresponds to the initial setting of the equipment in 
the circuit. 


4.3.2 Check the design strategy 


This procedure determines whether the design strategy will lead to 
a satisfactory circuit when all the equipment parameters are fixed at 
their nominal values. Through a Monte Carlo procedure the circuit is 
subjected to variations in the lengths of facility sections, and the 
design performance (i.e., the cumulative distributions of various trans- 
mission performance measures) is determined. The block indicated as 
PERFORMANCE COMPARATOR in the flowchart of Fig. 14 com- 
pares the design performance P with the design objective ®. If the 
design objective is met (indicated by P = ®), the process will continue 
to the initialization procedure. Otherwise, it is concluded that the 
design strategy is not suitable and the process will be aborted. 
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Fig. 14—Basic flowchart of the equipment requirement optimization process. 
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4.3.3 Initialize the process 


The initialization procedure determines suitable initial constraints 
for equipment parameters. The method of choosing these initial con- 


straints is as follows: 
1. For one-sided equipment parameters with upper bounds, we let 


py = Di = Oy, (35) 


where random variable 6; represents variations around the nominal 
parameter value p;. It is chosen from a uniform distribution whose 


mean and standard deviation are both equal to x;,;| p;;|, where x; is an 


input constant. 
2. For one-sided equipment parameters with lower bounds, we let 


Di = Diy — Oy, (36) 


where the random variable 6, is chosen from a uniform distribution 
whose mean and standard deviation are as given above. 

3. For two-sided equipment parameters, we use p; given by eq. (35), 
where the random variable 6, is chosen from a uniform distribution 
with zero mean, and the standard deviation x,|pj|, where xj is an 
input constant. Initially all factors x; are chosen equal to x, which is 
an input to the process (e.g., x = 0.1). 

To check the suitability of the initial constraints on equipment 
parameters based on the input value of x, a Monte Carlo analysis is 
performed in which samples are taken from the parameters of unspec- 


NETWORK SERVICES CIRCUITS 761 


ified equipment just defined. Note that at this point facility lengths 
will be fixed at their nominal values. A bad choice for x may overcon- 
strain some equipment parameters, or may cause the problem not to 
converge to a suitable solution in a reasonable number of runs. 
Decision on the suitability of the value of x is based on the values of 
the sensitivity factors M; of parameters p; as explained below. (See 
Appendix B for the definition of sensitivity factor.) We compare the 
sensitivity factors M; with two input values Mnin and Max. (Typical 
values: Myin = 0.4, Mmax = 1.6.) 

1. If Mj < Muin, it is concluded that equipment parameter pj; is 
undersensitive to variations; thus the initial range for this parameter 
must be loosened. 

2. If Mi > Mmax, it is concluded that p; is oversensitive to variations 
and the corresponding initial range must be tightened. 

3. If Mmin = Mi < Mmax, the initial range for p; will not be changed. 
Note that the initial range for equipment parameter p, is related to 
the value of x,;. To adjust the initial range, new values for x; are 
determined according to the following algorithm, which has proved to 
work satisfactorily. For 1 = 1, 2, ---, nj; j = 1, 2, ---, N; if My < 
Monin, 


ne = (1 + Mnin > M;j)x?!", (37) 
if M; > Mrnax, 
new = (1 + Minox — Mi) x9!4; (38) 


and if Main = M; = M sas: 
xpew = old (39) 
4.3.4 Analyze and interpret 


This part of the process involves iterations during which the initial 
equipment parameter ranges are tightened just enough to achieve 
satisfactory design performance. The procedure is based on a Monte 
Carlo analysis where samples are chosen from both facility length and 
equipment parameter distributions. After each Monte Carlo run, the 
block indicated as PERFORMANCE COMPARATOR in the flow- 
chart of Fig. 14 compares the design performance P with the design 
objective ®. If the design objective is not met, the constraint for at 
least one parameter will be modified (as will be explained) and the 
Monte Carlo analysis procedure is repeated. The Monte Carlo runs 
continue until the design objective is met, at which point the final 
constraints for equipment parameters are determined. 

The procedure for the modification of constraints for equipment 
parameters is based on the algorithms described in Appendix B. This 
procedure is as follows. Let 
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max My = Mi, (40) 
uJ 


where M,, is the sensitivity factor corresponding to equipment param- 
eter p;,. Consider the following cases: 

1. pj is one-sided with an upper limit. In this case the intersection 
of “Y” and “N” histograms for p,; will be the new upper limit (see Fig. 
15a). 

2. pi is one-sided with a lower limit. In this case the intersection of 
“Y” and “N” histograms for p; will be the new lower limit (see Fig. 
15b). 

3. py is two-sided. In this case the intersection of “Y” and “N” 
histograms for p; defines the new range for this parameter as shown 
in Figs. 15c and 15d. (If “Y” histogram is on the right side of “N” 
histogram, the new range will be determined from the “Y” histogram, 
as shown in Fig. 15c.) 


4.3.5 Incremental analysis 


It is desirable to let the nominal characteristics (e.g., loss) of each 
facility section vary within a range. In such a case we would like to 
determine the corresponding ranges for the nominal values of equip- 
ment parameters. Also, it will still be desirable to determine the loosest 
possible constraints on equipment parameters for each set of nominal 
facility specifications. An example (Section 4.4) will demonstrate that 
the nominal values of equipment parameters depend only on the 
nominal facility specifications, and that the optimal variations around 
the nominal values of equipment parameters depend only on the 
variations around nominal facility specifications. 


4.4 Example 


This section demonstrates the equipment requirements optimization 
process on a foreign exchange (FX) line. The circuit under consider- 
ation consists of three 2-wire metallic facility sections and two re- 
peaters, as shown in Fig. 3. In general, metallic facility sections can 
be specified by their physical parameters such as length, gauge, and 
loading characteristics. Using the Universal Cable Circuit Analysis 
Program (UNICCAP),” the loss for each facility section can then be 
calculated for standard terminations at any frequency. We assume, 
for convenience, that the facility sections in Fig. 3 are specified via 
their respective nominal losses at frequencies of 400, 1000, and 2800 
Hz. Further, we assume that the loss at frequency f of each facility 
section F; is subject to variation A,(f). Thus, the losses for the facility 
sections are given by 


f = 400, 1000, 2800 Hz, (41) 


NETWORK SERVICES CIRCUITS 763 


b86L ANNI-AVW “TVNANOfl TWOINHD3IL = #9Z 


= NEW RANGE ele 


(a) 


(c) 


“~NEW UPPER 


LIMIT 


Pry 


N 
| ‘“~\OLD UPPER 
LIMIT 


Pry 





Pry 
4 A R 
OLD LOWER _-~ NEW Oeheee ~NEW RANGE — 
LIMIT 
& -——_——O OLD RANGE--—-—-—-—— e 
(b) 
N Y N 
Pry 


| gl new | | 


RANGE 


Soon y ees 


(d) 


Fig. 15—Typical “Y” and “N” histograms for equipment parameter p,,. (a) One-sided 
parameter with upper limit. (b) One-sided parameter with lower limit. (c), (d) Two- 
sided parameter. 


where L;(f) indicates the nominal loss of facility section F; at frequency 
f and A,(f) is a random variable indicating variations around [;(f). 
The type of distribution for A,;(f) can. be specified as well. Here we 
assume that A,(f), i = 1, 2, 3 are normally distributed with zero means 
and with specified standard deviations. We indicate the lower and the 
upper limits of nominal loss of facility F; by I; and L,;, respectively. 
These values as well as the values of A;(f) for the example under 
consideration are given in Table II. 

The equipment parameters used in this example are the gains of the 
repeaters at frequencies 400, 1000, and 2800 Hz, i.e., 


Pu = G,(400), Pa = G,(1000), P31 = G,(2800) 
Piz = G2(400), P22 = G,(1000), P32 = G,(2800), (42) 


where G;(f) indicates the gain of repeater E; at frequency f. The above 
parameters are all two-sided, i.e., an upper bound and a lower bound 
must be determined for each. Similar to eq. (41), we can write 


G(f) = Gf) +6(f, j=l, 2 
f = 400, 1000, 2800 Hz. (43) 


We are looking for the values G;(f) as well as the characteristics of 
random variables 6;(/). 

The normalized circuit parameters NP; and NP, and the design 
objective defined in eqs. (31) to (83) and depicted in Figs. 10 and 12 
will be used in this example. Using eqs. (28), (41), and (43), we can 
easily show that 


3 2 
NP, = | & A,(1000) — > 6,(1000) |. (44) 
Fd 


1=1 


Table II—Facility specifications 


Nominal 
Loss and 
Facility Std. Dev. 
Section (in dB) At 400 Hz At1000Hz At 2800 Hz 
In(f) 3 5 9 
1 Lu (f) 5 7 11 
Ai(f) 0.2 0.1 0.3 
Lv (f) 7 9 11 
2 U2 9 11 13 
Ao(f) 0.1 0.05 0.2 
Lis (f) 4 7 11 
3 Lus (f) 6 9 13 
A3(f) 0.2 0.1 0.3 
Notes: Li(f) = £ 


Li(f) + Alf) and Li(f) = Lif) s Lulf), 
1, 2, 3; f = 400, 1000, 2800 Hz. 
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Also, 


D(400) = D,,(400) + y {A;(400) — A,(1000)} 


{5,(400) — 5,(1000)}, (45) 
D(1000) = 0, (46) 


iM» 


and 


D(2800) = D,,(2800) + y [A;(2800) — A;(1000)] 


= z [6;(2800) — 6;(1000)], (47) 


where D,, is defined in Fig. 11, and A;(f) and 6;(f) are defined in eqs. 
(41) and (48), respectively. Thus the latent loss distortion ¥(f) in eq. 
(32) can be calculated for f = 400, 1000, and 2800 Hz, and the 
normalized circuit parameter NP? in eq. (32) can be expressed in terms 
of A;(f ) and 6;(f). 

Design strategy refers to the transmission plan which, for specified 
facility sections, determines the nominal values of equipment param- 
eters (i.e., the values of equipment parameters when all circuit param- 
eters are at design-target values and facility sections are not subjected 
to deviations). The transducer loss of the circuit of Fig. 3 at frequency 
f can be calculated as follows: 


L(f) = Li(f) + Le(f) + La(f) — Gif) — G2(f), (48) 


where L;(/) indicates the loss of facility section F; at frequency f Hz. 
Equation (48) is the same as eq. (28) where the error term L, has been 
neglected for convenience. The above formula holds, of course, also 
for nominal values: 


L(f) = Iy(f) + Lo(f) + Lal f) — Gi(f) — Ge(f). (49) 


A typical design strategy for the example under consideration is as 
follows. For the gain adjustment aspect of the strategy, we use the 
following formulas:?® 


= is L. : 

G,(1000) = £,(1000) + aa 2 (50) 

L;(1000) — 3.5 
2 2. 

where 3.5 GB is the design-target value for the 1-kHz circuit transducer 

loss. Solution of eqs. (50) and (51) results in the nominal values of 


G2(1000) = £3(1000) + (51) 


766 TECHNICAL JOURNAL, MAY-JUNE 1984 


repeater gains at 1 kHz. For the equalization aspect of the design 
strategy we use the following formulas [which are similar to eqs. (50) 
and (51)]: 


us Dl f) 





G\(f) — Gy(1000) = Dy f) +42 — Pas (52) 
Ga(f) — G4(1000) = Dy(f) + 74D _ Pall) (53) 


for f = 400 and 2800 Hz, where 
Df) = Lf) — [1 kHz) (54) 


represents the attenuation distortion corresponding to facility section 
F; [cf. eq. (29)] and D,,(f) is defined in Fig. 11. Solution of eqs. (52) 
and (53) results in the nominal values of repeater gains at frequencies 
400 and 2800 Hz. Thus the implementation of the design strategy 
described above results in the nominal values of the six equipment 
parameters, i.e., G;(f), j = 1, 2, f = 400, 1000, 2800 Hz based on the 
specified nominal values L,(f) for facility sections. If the nominal 
values [;(f) are allowed to vary in the range [L,;(f), Lui(f)], then the 
corresponding ranges [G;(f), G.;(f)] for each repeater nominal gain, 
G;(f), must be found. To do this, we allow the nominal values L;(f) to 
vary randomly within their specified ranges [L;(f), Lu:(f)] according 
to their specified (or assumed distributions). For each set of nominal 
facility losses, we determine the corresponding nominal repeater gains. 
Then the lower bounds and the upper bounds of nominal repeater 
gains will be determined. 

A FORTRAN program, named EROP, was written for this example. 
Table III shows the corresponding output for the input specified in 
Table I]. The number of iterations to achieve the final solution was 
15. 


Table III—EROP output for example in Section 4.4 
Re- Nominal 
peater Gain(indB) At 400 Hz At 1000 Hz At 2800 Hz 
1 Gu(f) 3.750 7.881 11.469 
Gulf) 6.692 10.701 13.939 
2 Gl f) 4.752 10.033 13.226 
Golf) 7.616 12.574 15.939 


Deviations of repeater gains (in dB): 


—0.2 < 6,(400) < 2.0, —0.2 < 6,(1000) = 0.4, -—2.8 < 6,(2800) <= 1.6 
—1.6 < 5,(400) < 0.2, —0.2 = 6,(1000) = 0.2, —1.2 = 6,(2800) <= 2.8 


Note: Gf) = Gf) + 4(f) and G(f) = G(f) = Gulf), 
j= 1, 2; f = 400, 1000, 2800. 
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V. CONCLUSION 


A perspective on the design of network services circuits has been 
discussed and recent developments in modeling and analysis tools 
have been reported. Models that have been addressed cover voiceband 
network equipment and are parameterized on a functional basis in 
terms of commonly used transmission measures. Tests show the 
models are consistent with the behavior of network equipment prod- 
ucts. The optimization problem has been studied under general scen- 
arios of fixed facility specifications, fixed equipment specifications, 
and partial specifications of facilities and equipment. Corresponding 
optimization methods have been developed and implemented in mech- 
anized analysis tools to provide a viable procedure for determining 
needed standards on network equipment. These developments are 
providing much of the foundation needed for a generic approach to 
the design of network services circuits. Although the discussion has 
centered around voiceband cables and repeaters, the optimization 
techniques reported here can also be used in digital applications. 
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APPENDIX A 
Glossary 


This appendix describes the terminology used in the main text of 
the paper. The terms are arranged in alphabetical order. 


Circuit—The telecommunications transmission path providing the network services 
extending from one end (customer location or central office) to another. 

Circuit parameters—Parameters that are important measures of circuit transmission 
characteristics (e.g., circuit transducer loss). 

Design architecture—A circuit topology consisting of generic transmission facilities and 
equipment. 

Design center (or design target)—Deterministic values that are specified targets for 
circuit parameters. 

Design configuration—A circuit topology consisting of physical equipment and trans- 
mission facilities. 

Design objective—Specified bounds on the statistical distributions of normalized circuit 
parameters. 

Design performance—Statistical distributions of normalized circuit parameters. 

Design strategy—A transmission plan that determines the nominal values of equipment 
parameters. 

Equipment functional specifications—Constraints for values of equipment parameters 
with their corresponding distributions within the constraints. 

Equipment parameters—A set of quantities whose values determine the characteristics 
of the equipment. 

Facility configuration—A circuit configuration that represents a physical combination 
of transmission facilities without specifying the junction points between facilities. 
Facility section—A portion of a circuit consisting of a transmission facility between two 

pieces of equipment. 

Facility specifications—The type and the physical specifications of each facility section 
in the circuit, or equivalently, the transmission parameters of each facility section 
that are relevant to the measurement of circuit transmission characteristics. 

Generic equipment model—A model whose primary parametric characterization is 
functional (e.g., transducer and return losses) rather than physical (e.g., resistance, 
inductance, and capacitance). 

Generic segment—A circuit segment that is functionally characterized apart from any 
specific equipment technology. 

Generic segment measures—The causes whose effects are the transmission performance 
measures. 

Incremental analysis—Circuit analysis where only variations of parameters around their 
nominal values are considered. 

Nominal circuit solution—The collection of nominal equipment parameters. 

Nominal equipment parameters—The values of equipment parameters determined on 
the basis of the given facility specifications (with no deviations) and transmission 
plan such that all the circuit parameters are at design-center values. 

Normalized circuit parameters—Parameters used to measure circuit performance. These 
parameters are derived from circuit parameters. 

One-sided equipment parameters—Equipment parameters that are constrained either 
from above or from below. 
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Transmission performance measures—Measures that can characterize the transmission 
performance of a circuit. 

Two-sided equipment parameters—Equipment parameters that are constrained both 
from above and from below. 


APPENDIX B 
The ‘Interpreter’ in the Statistical Optimization Program 
B.1 Algorithm for modification of parameter ranges 


For each random sample of parameters p;, 1 = 1, 2, ---, n, we 
determine whether or not the resulting circuit meets the worst-case 
requirements defined by the design objective. Therefore, at the end of 
each Monte Carlo run we have two histograms (or probability density 
functions) of the values of each unspecified parameter corresponding 
to the circuits that meet or do not meet such requirements. We indicate 
these histograms by “Y” and “N” as shown in Fig. 16. From these 
histograms we will determine the ranges of values for parameters p; 
over which the proportion of circuits meeting such requirements 
predominate. 

Figure 16 illustrates the case where parameter p; was assigned values 
in the original range, i.e., a <p; < c, and the circuits with low values 
of p; generally met the requirements. A new range for p; is determined 
from the intersections of “Y” and “N” histograms as shown in Fig. 16. 
By restricting the values of parameter p; to the new range (i.e., a < 
pi<) while the ranges of other parameters are unchanged, an increase 
will be expected in the proportion of circuits meeting the requirements 
in the next Monte Carlo run. This can be proved by the application 
of Bayes’ Theorem of Statistics,“ which relates a posteriori probability 
of events A; subject to the occurrences of event B, Pa(A;z), to a priori 
probabilities P(A;): 


Pa,(B)P(Ax) _ _Pa,(B)P(Ap) 


P,(Ar) = (55) 
P(B e 
(8) Py (B)P(A;) 
j=l 
In eq. (55), events A;, 7 = 1, 2, --- , nm are mutually exclusive and their 


union equals the universal event, and event B is arbitrary. Now 
consider the events 


A, = Y: circuit meets requirements 
A, = N: circuit does not meet requirements 
Bia<p;<b. (56) 
Note that 
P(N) = 1 — PCY). (57) 
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Fig. 16—Histograms of parameter p; corresponding to circuits that meet the require- 
‘pnts “Y” or do not meet the requirements “N”. 


Thus, for k = 1 and n = 2, eq. (55) becomes 


Py(p; < b)P(Y 
Sua v(Pi < b)P(Y) ae 
a<pi<b Py(a<p;<b)P(Y) + Py(a <p; < b)P(N) 

We want to show that the above algorithm for range modification at 
the end of each Monte Carlo run leads to an improvement on the 
probability of circuits meeting the requirements. That is, we want to 
prove that 

P( Y) > P(Y). (59) 

a<pi< 
Histograms “Y” and “N” in Fig. 16 represent conditional probability 
densities Py(p;) and Py(p;) for parameter (random variable) p; in 
circuits meeting or not meeting the requirements, respectively. We 
have, from Fig. 16, 


Py(pi) > Py(pi) for a<p; <b. (60) 
Thus, 


b b 
Py(a<p;<b)= { Py(p;)dp;i > i Py(pi)dp; 61) 


Py(a <p; <b). 


Using eqs. (57) and (61) in eq. (58) and noting that 0 < P(Y) <1 
yields eq. (59). 


B.2 Algorithm for choosing the parameter for treatment 


At the end of each Monte Carlo run we must determine the param- 
eter whose range of values most influences the proportion of circuits 
meeting the requirements. This can be accomplished by a sensitivity 
factor M; for each parameter p; as follows:*° 
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(b) 





(c) 
Fig. 17—Sensitivity factor M; for parameter p;. (a) M; ~ 2. (b) M; = 0. (c) 0 <M; < 2. 


M; = { [Py(pi) — Pn(pi) dpi. (62) 


Sensitivity factor M; is a measure of overlap of the probability density 
functions Py(p;) and Py(p;). In fact, it is the area of the nonoverlap- 
ping parts of these density functions. It varies between zero for the 
case of complete overlap and two for the case of no overlap. 

Figure 17 shows three different cases of overlap. Figure 17a corre- 
sponds to the case where circuits with sufficiently low values of p; 
meet the requirements independent of variations in any of the other 
parameters. That is, in this case, parameter p; is the determining factor 
in whether or not the circuit meets the requirements. 

Figure 17b corresponds to the situation where the value of parameter 
p; has no significant effect on whether or not the circuit meets the 
requirements. That is, the determining factor must be sought among 
other parameters. The case shown in Figure 17c is between the above 
two extreme situations. 

At the end of each Monte Carlo run, the parameter with the largest 
sensitivity factor will be chosen for range modification, while keeping 
the ranges of values for other parameters unchanged. 
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1980 Bell System Noise Survey of the Loop Plant 
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This paper presents the principal findings of measurements taken in 1980 
of voltages and current induced in Bell System loops by commercial power 
lines. These findings give designers of new, solid state, electronic switching 
and terminal equipment a detailed characterization of power induced signals. 
The last Bell System noise survey was in 1964, and since then power system 
load and telephone loop length have increased. Use of shielded cable has 
grown, and the number of poorly balanced party lines has decreased. The net 
result is that the loop plant contribution to main-station levels of induced 
longitudinal and metallic voltages indicate only a slight increase at the 90th 
percentile of 1 to 2 dB since 1964—an increase too small to be considered 
statistically significant. When the central office and loop plant contributions 
are combined, main station metallic noise shows a decrease at the 90th 
percentile of 6 dB since 1964. Because of diurnal variations, distributions of 
peak noise levels on a loop over a 24-hour period are 4 to 6 dB higher than 
corresponding distributions based on one-time measurements made during 
business hours. This survey provides the first characterization of induced 
longitudinal current based on measured data. Finally, as a result of the 
stratified, two-stage sampling scheme used for the survey, differences among 
urban, suburban, and rural environments have been identified, and the unique 
behavior of long rural loops has been highlighted. 


l. INTRODUCTION 
1.1 Survey motivation and objectives 


Since the earliest days of telephony, commercial power systems have 
imposed significant constraints on the design of the Bell System 


* AT&T Bell Laboratories; present affiliation Bell Communications Re- 
search, Inc. AT&T Bell Laboratories. 
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network. In fact, the evolution from open wire to shielded cable with 
twisted pairs was in part a response to the need for reduced suscepti- 
bility to induced noise. Despite the tremendous progress in reducing 
plant susceptibility through use of shielded cables, improved mitiga- 
tion techniques,’ and better-balanced plant, induction in the loop 
plant remains a major concern because of increasingly stringent design 
specifications and changing service requirements. For example, mod- 
ern solid state transmission equipment is often more vulnerable to 
damage from longitudinal voltages than were earlier generations of 
equipment. In addition to existing Bell System objectives, several 
public utility commissions have recently established noise standards 
for subscriber loops. This latter development, along with the restruc- 
turing of the Bell System, implies a need to develop programs to assess 
the performance of both power and telephone systems with respect to 
noise parameters. A detailed knowledge of noise in the loop plant is 
necessary to address these and similar issues. 

Since the last Bell System noise survey, which was conducted in 
1964,” significant changes have occurred in the power and telephone 
plants. The open-wire plant, which still served a significant percentage 
of loops in 1964, is virtually nonexistent today, and poorly balanced 
party lines, once a major cause of noisy loops, are rarely encountered. 
In contrast to these positive developments, power system loads have 
doubled since 1964, and homes, businesses, and utilities increasingly 
employ nonlinear devices, which can be potent sources of telephone 
interference. In addition, subscriber loop lengths have increased ap- 
proximately 14 percent since 1964.° The net effect of such changes 
cannot easily be predicted, but it is clear that the plant conditions 
present in 1964 no longer exist. Also, the 1964 noise data provide no 
information on longitudinal currents, noise spectra, central office 
voltages, or the effects of diurnal variations, all of which are now 
considered important to design and performance specifications.** Un- 
til now these parameters have been considered only by more localized 
measurement programs.® 

Thus, the pressing need for up-to-date noise characterizations and 
the limitations of previous systemwide noise data led the Electrical 
Protection and Interference Department of Bell Laboratories to con- 
duct the 1980 Bell System noise survey. The result of that survey is a 
characterization of induced noise at subscriber and central office ends 
of telephone pairs in the Bell System, with special consideration of 
the noisier long pairs. 


1.2 Scope of paper 


This paper discusses the design and presents the results of the 1980 
Bell System noise survey. The presentation emphasizes the informa- 
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tion most relevant to establishing induced noise characterizations. 
Section II briefly introduces the physical mechanisms and models 
associated with induction in the loop plant. Section III describes the 
survey design, defines measured and derived noise parameters, and 
concludes with a summary of the principal results. Section IV presents 
noise data measured at the central office interface, and Section V 
presents data measured at the main station interface. Finally, Section 
VI provides concluding comments. The appendix expands on the 
survey design, details the method of sample selection, and illustrates 
the mathematical technique used to derive means and cumulative 
distributions. 


Il. OVERVIEW OF POWER INDUCTION 


Power distribution lines are the principal source of induction in 
telephone lines, and are a natural consequence of both systems serving 
a common public and frequently sharing rights of way. Power line 
induction is a function of the magnitude of the net three-phase power 
currents and the distance over which telephone lines are exposed to 
these currents.”* Urban, suburban, and rural areas represent signifi- 
cantly different environments with respect to these parameters.® At 
one extreme are the densely populated urban areas, which tend to 
have relatively well-balanced, three-phase power systems and rela- 
tively short telephone loops. At the other extreme are rural areas, 
which are typically characterized by single-phase power distribution 
and long loops. Suburban areas generally encompass a wide range of 
intermediate conditions. 

The induced longitudinal (or common mode) disturbances,””® i.e., 
noise-to-ground and longitudinal current (as defined in Section 3.2), 
that result from power line exposures are essentially equal for both 
wires of a pair.’° Metallic noise (defined in Section 3.2) results pri- 
marily from electrical imbalances or asymmetries in the circuits 
formed by each wire of the pair. Since the metallic noise level depends 
on the longitudinal excitation, a measure of imbalance is the ratio of 
metallic noise to longitudinal noise (see Section 3.2 for a specific 
definition). 

Imbalances result from differences between series resistances and 
shunt capacitances of the two wires of a pair. (Components of this 
capacitance are discussed by Miller.’') These imbalances determine 
the level of noise appearing on a subscriber circuit. Therefore, wire 
pair cables are manufactured to minimize asymmetries, and circuit 
terminations are designed to achieve good balance.’*” 

Power induction influences telephone system design in two distinct 
ways. The 60- and 180-Hz spectral components, which usually domi- 
nate the induced signals, affect dissipation ratings, power arrange- 
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ments, and signaling characteristics of telecommunication equipment 
interfacing with the loop plant.®!*!4 The voiceband harmonics of 60 
Hz have an impact on transmission quality. The harmonics usually 
arise from nonlinearities of the power distribution transformers, 
which, because of symmetry conditions, predominantly generate odd 
multiples of 60 Hz.®'*!” Among these harmonics, the “triple-odd” 
harmonics of 60 Hz, i.e., odd multiples of 180 Hz, are of particular 
interest because they tend to add in phase on all three phases of a 
multigrounded-neutral (MGN) power distribution system. 

Diurnal variations in power system loads not only produce signifi- 
cant changes in induction levels but may also cause changes in spectral 
content.® For instance, an increase in the power system load increases 
the 60-Hz component of the current and produces greater 60-Hz 
induction in the telephone plant. However, harmonic levels may be 
reduced, because the increased power system current is accompanied 
by an increased voltage drop in the power lines. As a result of the 
increased drop, the voltage excitation of distribution transformer cores 
is slightly reduced, which leads to reduced harmonic generation.’*"® 

Although the above paragraphs highlight those aspects of induction 
phenomena most relevant to a basic analysis of the survey data, an 
appropriate measure of exposure length requires further discussion. 
From physical considerations it can be argued that total loop length* 
is the most direct measure of potential exposure, particularly in the 
case of measurements from the central office to an on-hook station. 
On the other hand, if measurements are made at a main station 
location, the choice between working length and total length is less 
obvious. However, examination of survey data indicated that main 
station noise distributions that used either working or total loop length 
as a parameter provided required information. Hence, we chose total 
loop length as the length parameter to be used in this paper. Figure 1 
shows the distributions of loop lengths of assigned working pairs." 

Once the length parameter had been chosen, it was desirable to 
define long loops since they were most likely to have the highest 
induction.” For the 1980 survey, long loops were arbitrarily defined as 
those having total lengths greater than 20 kft; all remaining loops 
were considered short. Two main considerations suggested this break- 
point. First, based on the 1973 loop plant survey,° this definition 


* Total length is the sum of working loop length plus all bridged tap length. Working 
length is the length of wire pair in which dc current flows when a station goes off hook; 
bridged tap length is the additional length of wire pair bridged onto the working length.® 

* Throughout this paper, distributions are characterized with respect to the popula- 
tion of working assigned pairs, unless stated otherwise. In 1980 there were approximately 
89,900,000 working assigned pairs in the Bell System, which excludes Southern New 
England Telephone and Cincinnati Bell. 
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STANDARD 
DEVIATION 
OF ESTIMATE 
MEAN OF MEAN 
eft) eft) 
URBAN 10.9 2.2 
SUBURBAN 15.4 1.1 
RURAL 17.1 2.1 
BELL SYSTEM 13.4 1.2 


——— URBAN 
—-—-—— SUBURBAN 


PERCENT OF ASSIGNED PAIRS WITH LENGTH LESS THAN ABSCISSA 


BELL SYSTEM 





0.5 1 2 5 10 20 50 100 
LENGTH IN K!ILOFEET 


Fig. 1—Total loop length for urban, suburban, rural, and Bell System loop popula- 
tions. 


assured the selection of a reasonable number of long loops without 
unduly increasing the complexity or size of the sample. Second, the 
authors believed that a number of applications of the survey results 
would require separating the loop population at approximately 20 kft, 
since that is approximately the dividing line between loaded and 
nonloaded loops. For example, loop electronics that operate on non- 
loaded loops would serve the short loop population. Also, the increased 
use of subscriber carrier svstems should continue to reduce the pro- 
portion of physical loops exceeding 20 kft in total length. 

Since induced signal levels are a function of both exposure length 
and details of the power system configuration, the measured levels for 
a given length have a large variance.’ In addition, for wire pairs 
significantly under 10 kft in total length, currents in local grounds will 
contribute a component to the noise’* that is not length dependent. 
As a result, induced noise on wire pairs in areas with heavy power 
usage, such as industrial areas near power plants, may not show a 
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strong length dependence until longer exposure lengths are reached. 
For very long exposures there is a decrease in the rate of increase of 
induced noise. Wire pairs can be modeled as lossy transmission lines.* 
As exposure lengths increase much beyond 30 kft, the mean level of 
induced signals increases much more with length than for exposures 
of intermediate length. This is a direct result of the attenuation of 
signals induced remotely from the observation point. 


Ill. OVERVIEW OF SURVEY 
3.1 Survey design 


A major objective of the 1980 noise survey was to characterize 
statistically the induced voltages and currents present at central office 
and subscriber loop interfaces. An additional goal was to identify the 
small percentage of Bell System loops experiencing high induction 
levels, without using a large sample size. To accomplish this, we 
stratified the Bell System into urban, suburban, and rural wire centers. 
Urban wire centers were defined as those serving more than 1000 
assigned pairs per square mile (ap/sq.mi.), suburban centers, between 
45 and 1000 ap/sq.mi., and rural centers, less than 45 ap/sq.mi. This 
stratification assured representation of noisy loops, which were as- 
sumed to be the long loops (over 20 kft in total length) in the suburban 
and rural environments. 

The survey sample was selected using a stratified, two-stage sam- 
pling scheme. The 36 sampled wire centers were geographically dis- 
persed as shown in the appendix: 6 were urban wire centers, 18 
suburban, and 12 rural. The loops to be tested were randomly selected 
from physical loops in a wire center. To limit the scope of the loop 
survey, we excluded loops providing special services or served by carrier 
systems. We tested 30 to 40 assigned pairs in each wire center. Table 
I summarizes the number of tested loops in the survey and compares 
the proportions of loops in the survey and Bell System populations. A 
relatively high proportion of rural loops was used to identify high 
induction loops. The appendix gives details on the survey design. 

Once the measurements were completed, cumulative distributions 
and means for the various noise parameters were obtained for urban, 
suburban, and rural strata and for the Bell System. In addition, 
distributions were calculated for short loops in each strata and long 
loops in the nonurban strata. The statistics were calculated using a 
general ratio technique as described in the appendix. 


* These models can be developed from the results in a paper by Parker,® Appendix A 
of a paper by Wilson,’ or Appendix E of a book by Smith.”® 
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Table I—Composition of tested loop and Bell System loop populations 


(a) Loops in Survey 
































Percent of Percent of | Percent of Percent of 
Number of Total Number of Total Number of Total Number of Total 
Tested Surveyed Tested Surveyed Tested Surveyed Tested Surveyed 
Loops Loops Loops Loops Loops Loops Loops Loops 





Long loops 20 1.6 369 29.4 
Short loops 161 12.8 887 70.6 
Total 181 14.4 1256 


(b) Percent of Bell System Loops in 1980 


Long loops 6 12 3.5 21.5 
Short loops 43 29 6.5 78.5 
Total 49 41 10 100 


3.2 Noise parameters and their measurements 


The measured and derived noise quantities, which are illustrated 
for a particular configuration in Fig. 2, are defined as follows: 

1. Noise to ground, Ng, is the average of the tip and ring conductor 
voltages to ground. Conventionally, it is measured using a 3-kHz flat 
or a C-message weighted filter. The resulting rms values are expressed 
in units of dBrn and dBrnC, respectively. Figure 1 shows the conver- 
sion from rms voltage to dBrn; the same formula holds for the 
conversion from C-message weighted rms voltage to dBrnC. 

2. Metallic noise, Nm, is the voltage across a nominal 600-ohm 
resistor connected between the tip and ring conductors of a pair. The 
magnitude of the metallic noise is expressed in units of dBrn or dBrnC. 
C-message weighted metallic noise is a direct electrical measure of the 
noise perceived by the customer.’® 

3. Longitudinal current, I,, is the sum of the tip and ring short- 
circuit currents to ground. During the survey this current was deter- 
mined from the voltage across a 10-ohm resistor connecting tip and 
ring to ground. 

4, Longitudinal impedance is calculated as the ratio of longitudinal 
voltage to longitudinal current. Only the magnitude of the impedance 
is available from the survey data. 

5. Balance, B, is 20 times the logarithm of the ratio of the C-message 





LOOP CONDUCTORS 


Nm — METALLIC NOISE 
= 20 log (IV_/24.5 wV1) dBrn 


Ng — NOISE TO GROUND 
= 20 log (IV, /24.5 pV 1) dBrn, 


WHERE: 

Vg = (Vz + Vp)/2 Virms) 

1, — LONGITUDINAL CURRENT 
= (I, + 12) Alrms) 


B — BALANCE 
= 20 log (IV, /V_,1) dB 
=N.-N9 ™ 


m 


Fig. 2—Measured noise parameters. 


782 TECHNICAL JOURNAL, MAY-JUNE 1984 


weighted noise-to-ground voltage to the C-message weighted metallic 
noise voltage, or, equivalently, N,-N,, (dBrnC). 

A one-time set of measurements of the above parameters was made 
at main station and central office interfaces on 1256 loops at random 
times between 8:30 AM and 4:30 PM local time. These business hour 
measurements were made essentially simultaneously at both ends of 
the loop. The one-time business hour measurements at the main 
station provided the data most directly comparable to that of the 1964 
survey. 

To characterize the diurnal variation and spectral content of the 
noise parameters, automated measurements were made at central 
office and main station interfaces for at least 24 hours. Automated 
measurements at the central office were made for all noise parameters 
for about 950 of the tested loops. At the main station, automated 
noise-to-ground measurements were made for 195 loops. 

Corresponding to one-time and automated measurements, two types 
of cumulative distributions are used to describe the electrical param- 
eters. Consistent with previous surveys, business hour distributions 
are calculated from data obtained by the one-time measurements made 
on each loop. Daily maximum and daily median distributions are 
calculated from the maximum and median levels reached on each loop 
in a 24-hour period as determined by the automated measurements. 
In the following, business hour distributions are discussed unless 
stated otherwise. 


3.3 Principal findings 


Tables II and III summarize induction levels at main station and 
central office interfaces in terms of 3-kHz flat and C-message weighted 
noise to ground, longitudinal current, and metallic noise. (Distribu- 
tions for the more important parameters are given in Sections IV and 
V. However, a normal distribution can be assumed for the other 
parameters as long as it is understood that the actual tails of the 
distributions are not modeled well. The distributions presented in 
Sections IV and V provided the basis for understanding the limitations 
of such models.) The estimates in Tables II and III generally have 90- 
percent confidence intervals of +2 dB. The appendix discusses the 
derivation of these confidence intervals. Maximum levels in Tables II 
and III represent measured data obtained by considering both the 
1980 survey and its pilot, conducted in 1979. 

In addition to these statistics, other findings are as follows: 

1. Noise to ground, metallic noise, and longitudinal current at the 
central office have been characterized for the first time (see Table II). 
The 90th percentile of the longitudinal current is -12 dBmA (0.25 
mA). 
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Table I|—Central office noise to ground, 
longitudinal current, and metallic noise as 
measured to on-hook stations 


90th 
Percen- 
Mean Median tile Maximum* 
(a) Central Office Noise to Ground (dBrn) 

Urban 65 64 79 100 
Suburban 73 72 88 114 
Rural 719 81 92 110 
Bell System 70 68 87 114 
Bell System (DM)* 73 91 


(b) Central Office C-Message Weighted Noise to Ground 
(dBrnC) 


Urban <40 47 17 
Suburban 46 66 85 
Rural 54 69 83 
Bell System 41 61 85 
Bell System (DM)t 45 62 
(c) Central Office Longitudinal Current (dBmA) 

Urban —40 —26 —2 
Suburban —30 -8 25 
Rural —23 0 21 
Bell System ~34 -12 25 
Bell System (DM)* —32 —7 

(d) Central Office C-Message Weighted Metallic Noise (dBrnC) 
Urban 10 38 
Suburban 17 46 
Rural 19 53 
Bell System 15 53 
Bell System (DM)t 21 


* Largest values measured at any time as part of the survey. 

t Results are from distributions of daily maximums on mea- 
sured loops. All other statistics except maximums are for one- 
time measurements made during business hours. 


2. The median of the main station noise to ground as measured 
during business hours was 61 dBrnC. Although this represents a 
measured increase of 2 dB since 1964, the increase cannot be consid- 
ered statistically significant. The median of the daily maximum noise 
to ground, which had not been previously characterized, was 67 dBrnC. 
This difference in medians indicates a substantial diurnal variation in 
noise magnitudes. The 540-Hz component (ninth harmonic of 60 Hz) 
typically dominates C-message weighted noise to ground. 

3. The 90th percentile of main station metallic noise on the loop 
plant, as measured during business hours, is 11 dBrnC, and 4 percent 
of the loops exceed the traditional reference levels of 20 dBrnC. When 
contrasted to the comparable 1964 distribution calculated from Gresh,” 
there is an increase of 1 dB in the 90th percentile value. However, this 
increase cannot be considered statistically significant. 
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Table III—Main station noise to ground, 
longitudinal current, and metallic noise as 
measured to main distributing frames 


90th 
Percen- 
Mean Median tile Maximum* 
(a) Main Station Noise to Ground (dBrn) 
Urban 86 87 103 111 
Suburban 96 95 108 134 
Rural 96 95 108 117 
Bell System 91 92 106 134 
Bell System (DM)* 96 110 
dBrnC) 
Urban —_— 57 70 90 
Suburban 68 67 83 94 
Rural 70 69 85 101 
Bell System — 61 80 101 
Bell System (DM)* 67 82 
(c) Main Station Longitudinal Current (dBmA) 

Urban 9 9 20 30 
Suburban 16 16 28 47 
Rural 15 17 28 AJ 
Bell System 13 13 25 47 
(d) Main Station C-Message Weighted Metallic Noise (dBrnC) 
Urban 5 31 
Suburban 12 37 
Rural 16 51 
Bell System 11 51 


* Largest values measured at any time as part of the survey. 

t Results are from distributions of daily maximums on mea- 
sured loops. All other statistics except maximums are for one- 
time measurements made during business hours. 


4, The 90th percentile of main station metallic noise measured to a 
central office quiet termination during business hours is 13 dBrnC. 
Comparable results for 1964” indicate a 90th percentile of 19 dBrnC. 
For reasonable assumptions on the variance of the 1964 estimates, it 
is concluded the 90th percentile level has decreased approximately 6 
dB. This measurement of main station metallic noise includes both 
loop plant and central office contributions. Since the loop component 
has remained approximately constant, the improvement is attributed 
to a reduction in the central office contribution to the metallic noise. 

5. Central office metallic noise measurements are usually of interest 
as an indication of metallic noise at the main station. Simultaneous 
measurements at the two interfaces indicate that central office metallic 
noise measured to an on-hook main station is generally higher than 
main station metallic noise measured to the quiet termination of the 
central office. Based on the measured central office diurnal variations, 
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daily maximum distributions of main station metallic noise are be- 
tween 3- and 6-dB higher than the business hour distributions for 
corresponding percentiles. 

6. Power lines induce significantly different levels of noise to ground 
and longitudinal currents in urban and nonurban loops. This differ- 
ence exists because nonurban environments have longer loops and 
typically higher net power currents than urban environments. Urban 
short loop distributions of longitudinal noise parameters are typically 
6- to 10-dB lower than the nonurban short loop distributions. The 
importance of loop length is illustrated by the fact that for a nonurban 
stratum the median noise to ground for short loops (with total lengths 
less than or equal to 20 kft) is about 15-dB lower than for long loops 
(with lengths longer than 20 kft). 


IV. CENTRAL OFFICE RESULTS 


The need for information on noise parameters at the interface 
between the central office and the loop has grown as new generations 
of solid state loop and trunk terminating equipment have appeared.*° 
The 1980 Bell System noise survey provides the first relatively com- 
plete characterization of induction at this interface. Since central 
office loop testing systems, such as the Mechanized Loop Testing 
system (MLT),”° usually make noise measurements to on-hook main 
stations, data obtained under similar conditions (see Fig. 3) are of 
primary concern. However, since central office metallic noise depends 
on the metallic termination at the main station, it can be argued that 
central office metallic noise with an off-hook station is a more relevant 
measure of transmission quality. For this reason, metallic noise data 
obtained with an ungrounded 6320 resistor replacing the on-hook 
station set (see Fig. 3a) are also considered. 


4.1 Central office noise to ground 


The central office measurements confirm the postulates underlying 
the induction model used during the survey design (see Section II). 
Those postulates are: urban environments generally have shorter loops 
than nonurban; induction increases with loop length; urban loops, as 
a result of well-balanced power systems, have lower induction-per- 
unit length than nonurban loops; and nonurban long loops have the 
highest induction levels. Business hour distributions of central office 
C-message noise to ground given in Fig. 4 support the conclusion that 
urban loops have less induction that nonurban loops. The distributions 
of total loop lengths given in Fig. 1* confirm that urban environments 
have a higher proportion of short loops than nonurban environments. 


*In every plot of cumulative distributions, a straight line would imply a normal 
distribution. 
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Fig. 3—Central office measurement configurations used in survey. 


As a basis for further consideration of the postulates, Fig. 5 presents 
the noise-to-ground distributions for nonurban long and short loops, 
where long loops are greater than 20 kft in total length and short loops 
are less than or equal to 20 kft in total length. The urban long loop 
distribution, which is not shown, had a median of 42 dBrnC and a 
95th percentile of 63 dBrnC. Because the variance associated with the 
estimator of this distribution is large, as a result of the limited number 
of urban long loops in the sample (Table Ia), the distribution is not 
plotted. (Because of the small sample size, urban long-loop distribu- 
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NOISE TO GROUND IN dBrnc 


Fig. 4—Central office C-message weighted noise to ground for urban, suburban, rural, 
and Bell System loop populations. 


tions will not be plotted at all in this paper.) However, a reasonable* 
normal fit to the distribution confirms that urban long loops have 
lower induction levels than nonurban long loops. A comparison of the 
nonurban long and short loops on Fig. 5 shows that induction generally 
increases with loop length. Based on the short loop distributions, the 
urban loops are seen to have lower induction levels than the nonurban 
environments. This is consistent with the assumption that urban 
power systems are typically well balanced. 

Figure 6 illustrates the effects of diurnal variation by presenting the 
distributions of the 24-hour median and maximum levels of C-message 
and 3-kHz flat weighted noise to ground. The daily median and 


* A reasonable fit implies that based on a Kolmogorov-Smirnov test, the fit could 
not be rejected at the 5-percent significant level.”! 
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Fig. 5—Central office C-message weighted noise to ground for long (>20 kft) and 
short (=20 kft) loops. 


maximum distributions are derived from the median and largest read- 
ing, respectively, of the 24 hourly measurements made on each loop. 
The 3-kHz flat and C-message weighted distributions of maximums 
are displaced positively from the distributions of the daily medians by 
approximately 6 and 4 dB, respectively. The Bell System distribution 
of the 24-hour median levels is nearly identical to the business hour 
distribution of noise to ground calculated independently from one- 
time measurements, as can be seen by a comparison of C-message 
weighted distributions on Figs. 4 and 6. 

Figure 7 presents the distributions of the noise-to-ground harmonic 
levels at the odd multiples of 60 Hz. (Even harmonics of significant 
amplitudes are sufficiently rare that their presence constitutes a 
powerful noise diagnostic tool.) The Bell System distribution of each 
component has been characterized by its median and quartiles. For 
each component the distribution was derived from the daily median 
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Fig. 6—Central office median and maximum noise to ground in a 24-hour period. 


levels occurring on each loop. The spectral components decrease 
monotonically as frequency increases, except for the 9th and possibly 
the 15th harmonics, which are triple odd harmonic components, as 
defined in Section JJ. The ninth harmonic (540 Hz) generally proves 
to be the primary interfering frequency when C-message weighting is 
applied. 


4.2 Central office longitudinal current 


Because the majority of central office terminations have a low 
longitudinal impedance under operating conditions, longitudinal cur- 
rent provides a more direct characterization of longitudinal induction 
at the central office interface than does longitudinal voltage. Despite 
the importance of longitudinal current, only estimates based on the 
1964 survey were available previously. The 1980 current data are 
summarized in Table II, and distributions are given in Fig. 8. 

Longitudinal current and noise to ground are not independent 
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Fig. 7—Distributions of spectral levels of central office noise to ground. 


variables, since they are related by the longitudinal input impedance 
of the loop. The magnitude of the longitudinal impedance is defined 
as the ratio of the magnitudes of central office noise to ground to 
longitudinal current. A scatter plot of longitudinal impedance at 540 
Hz versus total loop length is presented in Fig. 9 for a subset of loops 
included in the survey, and is indicative of the behavior of the total 
loop population. Figure 9 shows that the impedance decreases essen- 
tially linearly with increasing loop length to beyond 30 kft, where 
transmission line loss effects become important. (A transmission line 
model with reasonable assumptions on loop resistance and capacitance 
to ground can be used to derive comparable results.)’* 

The longitudinal impedance at the central office is determined by 
the capacitance to ground of the cable pair, and the slope in Fig. 9 can 
be used to estimate the cable capacitance-per-unit length. The capac- 
itance corresponding to the slope of Fig. 9 is 0.037 uF /kft (0.2 wF/mi). 

Since longitudinal impedance decreases with loop length whereas 
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Fig. 8—Central office longitudinal current for urban, suburban, rural, and Bell 
System loop populations. 


noise to ground generally increases with loop length (Section 3.1), 
longitudinal current should exhibit a stronger dependence on loop 
length than does noise to ground. To illustrate this point, Fig. 10 
presents current distributions for nonurban long and short loops. The 
separation between current distributions for long and short loops in 
either nonurban environment is about 10 dB greater than the sepa- 
ration of corresponding noise-to-ground distributions (see Fig. 5). 
The capacitive nature of the longitudinal impedance for a large 
percentage of loops, i.e., 95 percent are under 30 kft, causes the 
spectrum of the longitudinal current to differ from the spectrum of its 
driving source, the noise to ground. As a direct consequence, the 
separation between 3-kHz flat and C-message noise-to-ground meas- 
urements on each loop differs from the separation of the corresponding 
longitudinal current measurements. Figure 11 shows that a separation 
of 10 to 15 dB exists between 3-kHz flat and C-message weighted 
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Fig. 9—Central office longitudinal impedance at 540 Hz versus total loop length. 


distributions of longitudinal current. This contrasts sharply with the 
approximately 30-dB difference between 3-kHz flat and C-message 
weighted distributions of noise to ground (see Fig. 6). 


4.3 Central office metallic noise 


Metallic noise is produced by induced longitudinal voltages and 
currents acting on imbalances that are located at loop terminations 
and distributed throughout the loop. In addition, coupling from other 
loops in the cable can contribute to measured metallic noise. Figure 
12 provides the central office distributions of metallic noise measured 
with on-hook station sets (see Fig. 3a). Figure 13 provides the central 
office distributions of metallic noise measured with a 6322 termination 
replacing the on-hook station and simulating an off-hook station set. 
Included on the latter plot is the Bell System distribution of metallic 
noise for the on-hook condition. In general, it is found that the on- 
hook distributions for the various strata show significantly higher 
levels than the corresponding off-hook distributions. 

Since induction increases with loop length and the imbalances are 
generally cumulative, metallic noise levels are generally greater on 
longer loops. Figure 14 presents the on-hook distributions of C- 
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Fig. 10—Central office longitudinal current for long (>20 kft) and short (<20 kft) 
loops. 


message weighted metallic noise for nonurban long loops and for short 
loops. The distributions for the long and the short loops are widely 
separated at low noise levels, but with the exception of long rural 
loops, approach one another in the region of the upper tails. The long 
rural loops with greater than 30-dBrnC metallic noise represent about 
0.15 percent of the total loops in the Bell System. 

Bell System metallic noise distributions of daily median and maxi- 
mum levels for on-hook conditions at the main station show the same 
general relationship observed for the noise to ground. Harmonic levels 
of central office metallic noise show a behavior similar to the noise- 
to-ground harmonic curve (e.g., a local peak occurs at 540 Hz). 


V. MAIN STATION RESULTS 


The 1980 noise survey measurements made at main station inter- 
faces update and expand data on noise voltages, and provide new data 
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Fig. 11—Central office median and maximum longitudinal current in a 24-hour 
period. 


on longitudinal currents. Since under normal operating conditions the 
loop has a balanced, grounded termination at the central office, main 
station longitudinal and metallic voltage measurements were made to 
balanced terminations at the central office. Two balanced office ter- 
minations were used. To determine the loop plant contribution to 
metallic noise, a well-balanced, grounded resistive termination was 
installed at the main distributing frame with the central office discon- 
nected from the loop (see Fig. 15a). The second termination was 
established by calling into the quiet termination. The quiet termina- 
tion is a 9002 balanced test termination located at the central office. 
Metallic noise measurements made to the latter termination are influ- 
enced by noise present on intraoffice cables and any imbalance asso- 
ciated with these cables or the office quiet termination. To determine 
the maximum longitudinal currents that may exist at the main station, 
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Fig. 12—Central office C-message weighted metallic noise to on-hook main station 
for urban, suburban, rural, and Bell System loop populations. 


the short-circuit longitudinal currents were measured with tip and 
ring grounded through 10 ohms at the central office (see Fig. 15b). 
Table II summarizes noise to ground, longitudinal current, and 
metallic noise data as measured at main station interfaces with a 
balanced, grounded termination at the main distributing frame. (Meas- 
urements to the quiet termination are considered in Sections 5.1 and 
5.3.) The Bell System noise to ground medians of 92 dBrn and 61 
dBrnC given in Tables IIa and b are found to be 2 dB higher than 
computed 1964 median levels. The 90th percentile of the metallic noise 
in Table IId is 1 dB higher than the 90th-percentile level of 10 dBrnC 
computed from 1964 distributions.” Thus a consistent pattern of higher 
1980 levels is indicated; however, the confidence intervals on these 
estimates do not permit a statistically meaningful determination of 
the increase. Computations were necessary in order to compare the 
1980 and 1964 data because Gresh presented the 1964 distributions 
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Fig. 13—Central office C-message weighted metallic noise to off-hook main station 
for urban, suburban, rural, and Bell System loop populations. 


with respect to the population of main stations, whereas this paper 
presents distributions with respect to the population of assigned pairs. 
Transformation of 1964 distributions to an assigned pairs basis re- 
quired using the distributions for the individual and party lines as 
published by Gresh and calculating weighted means of the percentiles 
for each noise level. The weights are the proportion of assigned pairs 
serving individual and party lines in 1964, 88 percent and 12 percent, 
respectively. The transformed distributions differed only slightly from 
those obtained directly from Gresh’s distributions for individual lines. 


5.1 Main station to ground 


Figure 16 presents distributions for C-message weighted, main sta- 
tion, noise to ground as taken in 1980 during business hours. Four 
statistics are shown for each curve: the median, the mean, the standard 
deviation of the estimate of the mean (sdem), and the standard 
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Fig. 14—Central office C-message weighted metallic noise to on-hook station for long 
(>20 kft) and short (<20 kft) loops. 


deviation of the distribution(s). The 90-percent confidence interval 
for the mean is 1.67 times the sdem. As an example of the significance 
of these distributions, Fig. 16 shows that the Bell System lines that 
exceed 90 dBrnC, a commonly recognized maintenance objective, are 
largely in nonurban areas. The percentage of Bell System loops ex- 
ceeding 90 dBrnC is estimated to be 1.4 percent; the upper bound to 
the -95-percent confidence interval for this estimate is 4 percent. 

Noise-to-ground measurements presented in this section were made 
to balanced resistive terminations at the main distributing frame (see 
Fig. 15a). The derived distributions are identical to those obtained 
using quiet terminations in the office. This implies that no significant 
longitudinal voltage source is present in intraoffice wiring. 

Figure 17 presents the distribution of noise to ground for nonurban 
long and for short loops, where short loops are less than or equal to 
20 kft in total length and long loops are greater than 20 kft in total 
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Fig. 15—Main station measurement configurations. (a) Metallic noise and noise to 
ground measurements. (b) Longitudinal current measurement. 


length. The medians of long and short loops for a given nonurban 
population differ by 13 to 15 dB. As was the case for central office 
measurements (see Section 3.1), this dependence on loop length ac- 
counts for much of the difference between urban and nonurban distri- 
butions of noise to ground. 

For typical subscriber loop terminations, which have a ground at 
the central office and no ground at the main station, the 3-kHz flat 
weighted noise to ground is the most direct measure of power exposure. 
Figure 18 presents the distributions of this important parameter. The 
3-kHz flat weighted distributions are displaced by approximately 30 
dB from the corresponding C-message weighted distributions of Fig. 
16. 

The changes in power load responsible for the central office diurnal 
noise variations also produce diurnal variations at the main station. 
The magnitude of the effect on 3-kHz flat and C-message weighted 
noise to ground is illustrated by Fig. 19, as well as by Table II. This 
information and the curves of Figs. 16 and 17 show that the business 
hour distributions are significantly lower than the daily maximum 
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Fig. 16—Main station C-message weighted noise to ground for urban, suburban, 
rural, and Bell System loop populations. 


distributions. At the 90th-percentile points, the difference is estimated 
to be 4 dB for the 3-kHz flat weighted data and 2 dB for the C-message 
weighted data. The differences remain essentially constant for percen- 
tiles greater than the 90th. These findings are consistent with the 
results of Heirman.® 

Figure 20 describes the relative spectral content of main station 
noise to ground. For Fig. 20 all hourly measurements on all loops for 
which diurnal measurements were made were equally weighted. The 
harmonic levels for each waveform were divided by the 3-kHz flat 
weighted magnitude to obtain a normalized amplitude. An analysis of 
these normalized harmonic levels indicates that the spectral compo- 
nents at 60 and 180 Hz dominate the 3-kHz flat weighted voltage 
spectrum. When C-message weighting is applied to the measured data, 
the 540-Hz component is generally dominant. The remaining odd 
harmonics between 300 and 900 Hz (the 5th through the 15th har- 
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Fig. 17—Main station C-message weighted noise to ground for long (>20 kft) and 
short (<20 kft) loops. 


monic) are of lesser but approximately equal importance in determin- 
ing C-message weighted noise levels. 


5.2 Main station longitudinal current 


Station equipment employing terminations with a low impedance 
to ground can be subjected to substantial induced longitudinal cur- 
rents. The first measured characterization of these main station cur- 
rents is summarized in Fig. 21. As a point of reference, the current on 
1 percent of Bell System assigned pairs exceeded 80 mA (88 dBmA). 
Currents measured at the main station are significantly higher than 
at the central office because of the change in termination conditions 
(compare Figs. 3c and 15b). 


5.3 Main station metallic noise 


Figure 22 presents the main station C-message weighted metallic 
noise distributions obtained with a termination at the main distrib- 
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Fig. 18—Main station noise to ground for urban, suburban, rural, and Bell System 
loop populations. 


uting frame. Since 20 and 30 dBrnC are standard levels of reference,” 
the corresponding percentiles are indicated on Fig. 22. Figure 23 
presents distributions for the nonurban long loops and short loops. 
Although nonurban long loops generally have higher longitudinal 
induction levels than short loops, Fig. 23 shows that except for the 
long rural loop distributions, the difference between the various me- 
tallic noise distributions becomes smaller in the upper tail of the 
distributions. A reasonable conjecture explaining the observed conver- 
gence of distributions in the region of 30 dBrnC is that operating 
company engineers make special efforts to limit metallic noise on high 
induction loops. . 

Up to this point the discussion has centered on noise measured with 
a balanced resistive termination at the main distributing frame. How- 
ever, Bell System recommended objectives for metallic noise refer to 
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Fig. 19—Maximum and average main station noise to ground in a 24-hour period. 


noise as measured to the central office quiet termination. Figure 24 
presents main station distributions of metallic noise measured under 
this condition. Since the quiet termination is accessed by a call to the 
central office, the metallic noise is influenced by the imbalances and 
noise introduced by battery plant and intraoffice cables, and noise 
within the central office switch. The effect of these sources can be 
seen in Fig. 25, which presents the Bell System distributions for 
metallic noise measured with the main distributing frame and quiet 
terminations. For a given percentile, metallic noise measured to the 
quiet termination is generally greater than the metallic noise measured 
to the main distributing frame. The two distributions show a maximum 
separation of 6 to 7 dB but approach one another in the upper tail of 
the distributions. This apparent convergence, which was not evident 
in results obtained in 1964,” could be attributed to statistical uncer- 
tainties in the tails of the data. 

In recent years the relationship between measurements of metallic 
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Fig. 20—Distributions of normalized spectral components of main station noise to 
ground. 


noise at the central office and at the main station has been of 
considerable interest. Although much work remains to be done, Fig. 
25 provides the basis for a comparison of four metallic noise distribu- 
tions. Two curves present distributions of central office metallic noise 
measured with on-hook and simulated off-hook main station termi- 
nations (see Section 4.3). The second set of two curves present distri- 
butions of main station metallic noise obtained with quiet and main 
distributing frame terminations. Figure 25 shows that the distribution 
of central office metallic noise measured to an on-hook station is 
generally higher than the other distributions and that all the distri- 
butions approach one another in the upper tails. As can be seen, the 
two central office distributions bracket the main station measurements 
in the range of greatest interest. This suggests that the application of 
special main station terminations may permit estimates of the upper 
tail of the subscriber noise distributions from central office measure- 
ments. 
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Fig. 21—Main station longitudinal current for urban, suburban, rural, and Bell 
System loop populations. 


5.4 Balance 


Balance is a measure of loop plant susceptibility to power system 
induction that complements the direct measures of power induction 
such as noise to ground and metallic noise. Balance as used in grade 
of service studies” and used by operating company engineers is defined 
as the difference in decibels between the main station C-message 
weighted noise to ground and metallic noise. This definition is implic- 
itly based on the assumption that power induction is the main source 
of metallic noise. If this assumption is not satisfied—a condition that 
occurs when a significant component of metallic noise comes from 
sources such as crosstalk or office noise—then the result of a balance 
calculation may be misleading. Thus, to avoid this difficulty while still 
characterizing the susceptibility of the 1980 loop plant, balance was 
calculated only for those loops with a main station metallic noise of 
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Fig. 22—Main station C-message weighted metallic noise as measured to main 
distributing frame for urban, suburban, rural, and Bell System loop populations. 


greater than 0 dBrnC, a procedure which eliminated about 50 percent 
of the tested loops. 

A scatter plot of balance versus noise to ground is given in Fig. 26 
and suggests an interesting phenomenon. As we discussed in Section 
4.3, metallic noise is generally below 30 dBrnC even on loops with 
high induction levels. Figure 26 indicates that the minimum balance 
improves as the induction increases and that the improvement in most 
instances is just sufficient to limit the worst-case metallic noise to 30 
dBrnC. Since the increasing minimum balance cannot be attributed 
to a physical mechanism, this behavior suggests that operating com- 
pany personnel have taken special measures to limit metallic noise to 
levels recommended in Bell Operating Company requirements. The 
cluster of plotted points on Fig. 26 with high induction and high 
balance indicates that a balance of 70 to 80 dB is achievable on noisy 
loops. 
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Fig. 23—Main station C-message weighted metallic noise as measured to main 
distributing frame for long (>20 kft) and short (<20 kft) loops. 


VI. SUMMARY 


This report of the findings of the 1980 noise survey describes induced 
noise voltages and currents observed at subscriber and central office 
loop interfaces. The noise-to-ground, metallic noise, and longitudinal 
current measurements made as part of the survey have resulted in 
statistical descriptions of amplitude distributions, diurnal variations, 
and spectra of these quantities. As a result of the stratification used, 
differences between urban and nonurban environments have been 
identified, and the unique behavior of rural long loops has been 
highlighted. Moreover, it is evident that the operating company engi- 
neer plays an important role in controlling the metallic noise at 
subscriber locations. 

This comprehensive characterization of the response of the loop 
plant to power line induction satisfies a pressing need for the data 
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Fig. 24—-Main station C-message weighted metallic noise as measured to quiet 
termination for urban, suburban, rural, and Bell System loop populations. 


necessary to establish loop noise performance criteria and equipment 
design standards. In the long term, the insights into the inductive 
interference process provided by the 1980 survey will also lead to 
improved techniques for diagnosis and mitigation of induced noise 
problems. 
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APPENDIX 


Sample Design 


This appendix describes the sample design of the 1980 noise survey. 
Section I details the method of selecting the sample and concludes 
with a description of the final sample. Section II describes the statis- 
tical techniques that were applied to the measured data and illustrates 
them with an example. The basic concepts of the sample design 
described here have been investigated in more detail by I. Nasell”® and 
J. A. Maher.”4 
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Fig. 26—Main station balance versus C-message weighted noise to ground. 


A.1 Sample selection 


A stratified, two-stage sampling plan was used to select sample units 
from the universe of working assigned pairs in the Bell System. Wire 
centers were stratified according to their assigned pairs per square 
mile (ap/sq.mi.). Urban centers served more than 100 ap/sq.mi., rural 
centers less than 45 ap/sq.mi., and suburban centers the intermediate 
densities. In the first stage of sampling, wire centers in each stratum 
were chosen with a probability proportional to size; in the second 
stage, assigned pairs were chosen randomly in each office. If a multi- 
party line was selected, then one of the main station locations was 
randomly chosen as the test point. 

The next several paragraphs describe the method used to implement 
this sampling plan. The first step was to obtain a description of all 
the wire centers in the Bell System through the use of 1978 plant 
utilization data and a 1976 survey of the Bell System Outside Plant 
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Engineering records. Using the CLLI (Common Language Location 
Identifier) code as an index, comparisons were made between these 
two databases to obtain the number of assigned pairs and the area 
served by each wire center. (Neither database had sufficient informa- 
tion to be used alone.) Since CLLI codes can be reassigned as wire 
centers are created and restructured, it was necessary to verify that a 
unique wire center had been identified. If the 1976 and 1978 assigned 
pairs differed by less than 12 percent, i.e., the expected growth rate, it 
was assumed that a unique identification had been made. 

The next step in implementation of the sampling procedure required 
defining stratum boundaries. This process was based in part on a 
characterization of central offices from which the sample of long loops 
had been selected for the 1964 loop survey.” The results of that study 
interpreted in light of the expected differences among the urban, 
suburban, and rural induction environments suggested a boundary 
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Fig. 27—Sampled wire center size versus serving density. 
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between rural and suburban offices of 45 ap/sq.mi. The choice of the 
urban boundary of 1000 ap/sq.mi. was based on the desire to balance 
the number of assigned loops in the urban and suburban categories. 

Finally, a pilot survey in 1979, which included wire centers from the 
three strata, provided the basis for determining the final sample size. 
The criterion for the 1980 survey was that the mean noise to ground 
at the main station for each stratum should be determined with a 90- 
percent confidence interval of 2 dB. Although it is not immediately 
evident, post-survey analysis showed that the sample size arrived at 
based on this criterion was sufficiently large to include the noisier, 
longer loops of the rural and suburban populations. The final sample 
consisted of approximately 1256 loops, and the testing took 10 months. 

Tables IV and V and Figs. 27 and 28 summarize the characteristics 
of the sample and locate the test sites. Table IV presents the proportion 
of loops and wire centers in each stratum and in the final sample. As 
we discussed in Section A.2, knowledge of the proportions of loops in 
each stratum is necessary to estimate statistical noise parameters. The 
relatively high porportion of suburban and rural loops in the sample 
results from the survey goal of characterizing the longer, noisier Bell 
System loops in a statistically meaningful way. 

Surveys of the Bell System frequently use office size rather than 
office density as a criterion for stratification. Figure 27 illustrates the 
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Fig. 283—Location of survey test sites. 
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Table IV—Percent of loops and central offices in 
urban, suburban, and rural environments 
Urban Suburban Rural 


Sample (%) (%) (%) 
Assigned pairs in Bell System 49 41 10 
Wire centers in Bell System 13 34 53 
Loops in survey 14 51 34 
Wire centers in survey 17 50 33 


range of office sizes and densities present in the 1980 survey. Larger 
offices generally have higher serving densities, but as might be ex- 
pected, a high correlation does not exist. 


A.2 Estimates of statistics 


This section describes the method used to estimate the means and 
cumulative distributions of the various noise parameters and the 
method used to estimate the variances of these estimates. The data 
used to form these estimates are assumed to have no significant 
experimental bias. 

Ratio estimators were used to estimate means and cumulative dis- 
tributions.”* The form of the estimator given below is appropriate to 
the case of wire centers selected with a probability proportional to size 
and a random sampling of pairs within the centers.”* The size assumed 
during the sample selection was the number of assigned pairs in each 
wire center in 1978. During the survey more recent information on 
office size was obtained. The formulas presented here and used in this 
report assume that the size of each wire center did not change from 
1978 to the time of the survey. Nasell allows for a correction if wire 
center size did change; several distributions recalculated using this 
correction did not change significantly. 

The forms of the estimators are summarized below. The statistic to 
be calculated for the Bell System population is assumed to be f. This 
statistic could be the mean of a parameter, or it could be the proportion 
of loops that have a parameter with a value less than a given level, x. 
Note that if f is this proportion of loops, then f(x) is the cumulative 
distribution function. The ratio estimate of f can be written 
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NET 
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South Central Bell 


Southwestern Bell 
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Michigan Bell 
Northwestern Bell 


New York Tel 


Wisconain Rell 


Table V—Summary of 1980 survey sites 


Wire Center 
Harwich, Massachusetts 
Coventry, Connecticut 
Burlington, New Jersey 


Sharpsville, Pennsylvania 
Philadelphia, Pennsylvania 


Keedysville, Maryland 
Montgomery, West Virginia 
Oakhill, West Virginia 
Danville, Virginia 


Pleasant Garden, North Carolina 
Gulf Breeze, Florida 


Yazoo City, Mississippi 
Ripley, Tennessee 

Biloxi, Mississippi 

New Orleans, Louisiana 
Hale Center, Texas 
Tyler, Texas 

Campbell, Missouri 

West Memphis, Arkansas 
Tulsa, Oklahoma 


El Toro, California 
Gardena, California 


Dallas, Oregon 
Springville, Utah 
Phoenix, Arizona 
Twin Falls, Idaho 
Bolingbrook, Illinois 
Uhrichsville, Ohio 
Lansing, Michigan 


Council Bluffs, Iowa 

East Soderville, Minnesota 
Mobridge, South Dakota 
Omaha, Nebraska 


Albany, New York 
Clarksville, New York 


Rinhmand Uliannansin 


CLLI 
HRWCMAMA 
CNTYCTOO 
BURLNJBU 


SRVLPASH 
PHLAPAPE 


KDVLMDKV 
MTGMMVMG 
OKHLWVCH 
DAVLVADA 


GNBONCPL 
GLBRFLMA 


YZCYMSMA 
RPLYTNMA 
BILXMSED 
NWORLAMA 


HLCTTXHC 
TYLRTXCH 
CMPBMOCH 
WMMPARMA 
TULSOKFI 


ELTRCA11 
GRDNCAO01 


DLLSOR58 


SPVLUTMA 
PHNXAZMA 
TWFLIDMA 


BGBKILBK 
UHVLOH92 
LNNGMIMN 


CNBLIADT 
SDVLMNSO 
MBRGSDCO 
OMAHNEIZ 


ALBYNYSS 
CLVLYNCK 


TVONER TINT - 


Environment 


Suburban 
Suburban 
Suburban 


Suburban 
Urban 


Rural 
Rural 
Suburban 
Suburban 


Suburban 
Suburban 


Rural 
Rural 
Suburban 
Urban 


Rural 
Rural 
Rural 
Suburban 
Suburban 


Suburban 
Urban 


Rural 


Suburban 
Urban 
Suburban 


Suburban 
Suburban 
Suburban 


Suburban 
Rural 
Rural 
Urban 


Urban 
Rural 


Test Dates 

4/28-5/2 
5/5-5/9 

2/18-2/22 


4/21-4/25 
12/1-12/5 


3/17-3/21 
4/7-4/11 

3/24-3/28 

3/31-4/4 


11/17-11/21 
11/10-11/14 


10/20-10/24 

10/13-10/17 

10/27-10/31 
11/3-11/7 


9/8-9/11 
9/15-9/19 
9/29-10/3 
10/6-10/10 
9/22-9/26 


8/18-8/22 
8/11-8/15 


8/4-8/8 
7/21-71/25 
8/25-8/29 
7/28-8/1 
5/19-5/23 
4/14-4/18 
5/12-5/16 
6/16-6/20 
5/26-5/30 
6/9-6/13 
6/23-6/27 
12/8-12/10 
12/11-12/16 


ny = (x vinta) mi (2) 
and 
dj = (x vin) my, (3) 


where the sum over index i is a sum over the strata, the sum over 
index j is a sum over offices in a stratum, and the sum over k is a sum 
over loops in an office. The proportion of assigned pairs in stratum i 
is w;. The number of loops measured in office j of stratum i is mj. The 
number of offices in stratum 1 is n;. 

The parameters u;, allow for the determination of f for any pre- 
specified subpopulation of tested loops, e.g., long or short loops. If 
the tested loop is a member of the subpopulation, u;, = 1; otherwise 
ijk = 0. The special case of a ratio to size estimator assumes all tested 
loops are to be considered and wu; = 1 for all loops. 

If the mean of a parameter is required, then xj, is equal to the 
measured value of the parameter. If the proportion of loops with a 
parameter less than x is required, then x;, = 1 if the measured value is 
less than x, and x, = 0 if the measured level is greater than or equal 
to x. 

For a given stratum the statistic corresponding to f is 


d i 


fi = Fa: i= 1, 2, 3. (4) 
ij 
j 


At the office level the statistic corresponding to f is 
fg = niy/dy. (5) 


This report considers only the variance of the ratio-to-size estimator. 
For this case the variance is”® 


v*(f) = (wi)’vi( fr) + (we)?v2( fo) + (we)*v3(fa), (6) 


where vu?(f;) is the variance of f;, i = 1, 2, 3. 
The variance of f; is 


1 
nny 


vf) = 7 Bb Ge f| (7) 


where n; is the number of offices in stratum 1. 
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Fig. 29—Main station C-message weighted noise to ground for urban, suburban, 
rural, and Bell System loop populations. 


As an illustration, Fig. 29 shows the distribution of the C-message 
noise to ground at the main station for the urban, suburban, rural, 
and Bell System environments. As is the case in Fig. 29, Bell System 
distributions for noise parameters generally lie between the distribu- 
tions for urban and suburban environments with a larger variance 
than either. 

Table VI presents statistics relevant to the median and 90th per- 
centile of the rural and Bell System C-message weighted noise to 
ground. Two statistically equivalent confidence intervals are consid- 
ered.”> One confidence interval assumes that the quantile, x, is given 
and that a 90-percent confidence interval for the estimate of the 
percentile, f(x), is to be calculated. This confidence interval is sym- 
metrically located with respect to the estimate of the percentile and 
has a length of twice 1.67 times the sdem, where the sdem is the square 
root of the variance, v?. The second confidence interval assumes that 
the percentile is given and that a 90-percent confidence interval for 
the quantile is to be determined. Since this confidence interval is not 
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Table VI—C-message weighted noise-to- 
ground statistics 


(a) Median C-Message Noise to Ground 


Bell System 


Rural Loops Loops 
Quantile 69 dBrnC 61 dBrnC 
LLCI* 67 dBrnC 60 dBrne 
ULCI* 70.5 dBrnC 62 dBrnC 
CIP? +5% +3% 


(b) 90th Percentile C-Message Noise to Ground 


Bell System 


Rural Loops Loops 

Quantile 85 dBrnC 80 dBrnC 
LLCI* 84 dBrnC 78.5 dBrnC 
ULcI* 86.5 dBrnC 80.5 dBrnC 
CIP? +2% +1.5% 

* Lower limit of 90% confidence interval for quan- 
tile. 

t Upper limit of 90% confidence interval for quan- 
tile. 


* 90% confidence intervals for percentile. 


necessarily symmetric, upper and lower limits of the interval must be 
given. 

Table VI indicates that for assigned pairs in rural environments, 
the 90th percentile of the C-message weighted noise to ground at the 
main station is 85 dBrnC. The confidence interval on the percentile 
means that for the given noise level of 85 dBrnC, the estimate of the 
percentile would have fallen between 88.7 and 91.3 percent for 90 
percent of the possible samples that could be chosen by the sampling 
scheme described here. Equivalently, for the given percentile of 90 
percent, the estimate of the noise level would have fallen between 78.5 
and 80.5 dBrnC for 90 percent of possible samples. 
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This paper examines the accuracy of telephone-usage-demand forecasts 
produced by three state-of-the-art forecasting techniques: Box-Jenkins, 
Akaike state space, and an autoregressive spectrum estimation. The study 
considers 35 actual monthly demand time series and 300 simulated realiza- 
tions. Principal results are that: (1) correct identification of the nonstationary 
behavior of telephone demand is crucial to forecast performance, (2) overpar- 
ameterization or underparameterization of the stationary aspects of a process 
has little or no impact on the accuracy of the forecast, and (3) forecasts based 
on the naive random-walk method compare favorably to those produced by 
sophisticated techniques. Also, strengths and weaknesses of the investigated 
techniques are revealed through data analysis. It is further argued that the 
traditional method for assessing the accuracy of the usage forecast based on 
average busy season quantities is biased towards underforecasting. 


I. INTRODUCTION 


Errors in forecasting per-telephone usage demand at the switching- 
machine level can be quite costly for the Bell operating companies. It 
has been estimated that a net 3-percent overforecast would yield an 
increase in capital requirements of about $800 million.’ The result of 
underforecasting is degraded service. 

The usage variable to be projected is known as the hundred call 
seconds per main station (CCS/M).' This variable is required to 


* AT&T Bell Laboratories. 
tOne CCS is one hundred call seconds of usage per hour and is equal to 1/36 Erlang. 
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determine central office relief timing and sizing strategies. In this 
regard forecasts are needed up to five years ahead. 

Historically, the most commonly used CCS/M forecasting technique 
in the Bell operating companies has been linear regression trending. 
The result has often been a large forecast error. In this paper, we 
examine three state-of-the-art time-series techniques. These are the 
Autoregressive Integrated Moving Average (ARIMA) class of models 
and the Box-Jenkins philosophy, a state space method developed by 
Akaike, and a robust autoregressive spectrum estimation and factori- 
zation approach. __ 

The Box-Jenkins technique provides a procedure for model build- 
ing.” But, because the procedure requires user judgment, it can lead 
different users to identify different models for the same set of data. 
The question naturally arises as to what situations lead to different 
identified models and what impact this has on forecasting accuracy. 

On the other hand, attempts have been made to eliminate judgment 
by developing procedures that automatically determine the order of 
the most representative model. The state space approach presented in 
this paper and its information criterion (known as the AIC) have been 
advocated to do just that, assuming that the given process is station- 
ary.® : 

Furthermore, efforts were spent in seeking procedures that ma 
circumvent the identification stage entirely. The nonparametric 
method for estimating the spectrum, and hence the coefficients of 
autoregressive models, is a result of these efforts. It is examined here 
since it may provide a viable alternative.* 

As a first step in assessing the three techniques, we selected 35 time- 
series data on CCS/M from more than 100 series that were available. 
In this selection we avoided time series that were contaminated with 
many outliers or with large jumps. The selection was strictly based on 
visual inspection of the time-series graphs. 

Limiting the investigation to “clean” data does not hinder the 
significance of its findings. Our primary objective is to answer the 
question: Is there a method better than a naive one that can improve 
the CCS/M forecast even for the most regular offices? Moreover, 
concerning outliers, steps have been taken to guarantee better data- 
collection procedures via reliable mechanization. As for jumps, we 
were able to associate a causative factor with each jump observed in 
the data. Such information ought to be incorporated in the forecasting 
mechanism. 

Results obtained from applying these techniques to the selected 
time series show that none of the techniques was able to provide a 
marked improvement in forecast accuracy—even for 1 year ahead— 
when compared to a simple-minded forecast that uses data from 1 
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year past. Moreover, the models identified via the state space approach 
were different from those identified by the Box-Jenkins technique. 
Existing literature concerning the performance of the AIC criterion in 
identifying the order of a pure autoregressive model indicates that the 
AIC has a tendency to overparameterize.° Our results show that the 
criterion mostly underparameterizes and is inconsistent in general. 

As a first impression, one may attribute these findings to many 
factors. Some possibilities include the short history data, outliers, 
failure to reduce the data to stationary time series, nonconstant 
variance of the noise process and/or the applicability of the criterion 
used in comparing forecasting accuracy of the various methods. 

The Bell operating companies measure of forecast accuracy for the 
CCS/M is the percent difference between a forecast and an actual for 
a quantity known as Average Busy Season (ABS). The ABS is defined 
as the average of the highest three observations in a year. Forecast 
accuracy is only relevant for ABS values. It is the ABS forecast that 
is used for planning and engineering. 

A simulation experiment was conducted to verify and explain results 
obtained from analyzing the CCS/M data. Three typical processes 
identified from the CCS/M time series were chosen. In the simulation, 
one hundred realizations were generated from each of the three proc- 
esses. Using the simulated data to assess the various approaches, 
findings similar to those obtained from analyzing actual data were 
disclosed. These findings, however, provided additional insight into a 
number of aspects of the problem. 

This paper is organized as follows. Section II outlines the three 
investigated techniques. Section III discusses our experience in apply- 
ing these techniques to the CCS/M data, and illustrates, where pos- 
sible, the effect of overfitting, underfitting, and misspecification of 
model parameters on the forecast accuracy. Section IV describes the 
simulation experiment and assesses the performance of the three 
approaches in an artificially structured ideal environment for the CCS/ 
M processes. Section V summarizes the results and outlines a number 
of observations that were noted during the investigation. 


ll. METHODOLOGY 


The three investigated approaches for the analysis and forecasting 
of the CCS/M variable are all claimed to be powerful. They are also 
complicated and expensive (in varying degrees). They depend heavily 
on elaborate software packages that are not normally available in 
standard computer systems. 

Our reason, then, for investigating these approaches was to under- 
stand the properties of the data in order to suggest a simple method. 
Hypothetically, using state-of-the-art methods would allow for quan- 
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tifying the maximum forecast accuracy that can be achieved, or the 
maximum loss that may be incurred due to simplification. 

In the next three subsections, we outline these methods. The follow- 
ing presentation assumes that the reader is familiar with autoregres- 
sive and moving average processes. 


2.1 The Box-Jenkins approach for seasonal data 


For seasonal time series of period S Box and Jenkins suggest 
representation by members of the following class 


p(B) p(B*)(1 — B)*(1 — BS)”(% — nu) = 0,(B)Og(BS)a,, (1) 


where 
Zz; = the time series data at time t 
p. = the location parameter 
B=the backshift operator such that 
Bz, = 24-+ 
a, = independent random variable with 
mean zero and variance o2, known 
as white noise 
d and D = the degrees of differencing required 
for achieving process stationarity 
¢p(B), &p(B), 64(B) and @z(B) = polynomials in B of order p, P, gq, 
and Q, respectively. 
The class of models in (1) is known as multiplicative seasonal Auto- 
regressive Integrated Moving Average (ARIMA) of order (p, d, q) X 
(P ’ D ’ Q)s. 
The Box-Jenkins approach for fitting models of the form (1) involves 
a three-part cycle of identification, estimation, and diagnostic check- 
ing. The cycle is ended once an adequate model is derived. The basic 
tools for model identification are two descriptive functions known as 
the sample Autocorrelation Function (ACF) and the sample Partial 
Autocorrelation Function (PACF). Nonstationarity of a series is iden- 
tified from the behavior of its ACF. For nonstationary series, the serial 
correlation in the ACF remains large at large lags. In this situation, 
the integrated part of the ARIMA representation allows for differenc- 
ing to induce stationarity. Once stationarity is achieved, the problem 
is to select reasonable values for p, g, P, and Q. This is also done by 
examining the shape of the ACF (and PACF) of the stationary series. 
In the estimation stage, the parameters ¢’s, ®’s, 6’s, and ©’s of the 
identified model are calculated on the basis of the minimization of the 
sum of squared errors {a;}. 
After the model is fitted, the residuals {a,} are checked for whiteness. 
If no pattern is detected in the autocorrelations of residuals, the 
assumption that the a,’s are white noise is accepted. If, on the other 
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hand, a pattern is observed, such a pattern would provide information 
on how to modify the model. In the latter. case, the identification, 
estimation, and diagnostic checking cycle is repeated. 

We note from (1) that the Box-Jenkins approach provides a capa- 
bility for representing a wide class of nonstationary time series. 
Through differencing an appropriate number of times, the nonstation- 
ary behavior can be removed. The use of differencing, however, re- 
quires judgment. Nevertheless, we adopted this technique to transform 
all 35 CCS/M time-series data before employing any of the other 
approaches. These other approaches deal only with the representation 
of stationary processes. But, we tested the impact of adopting this 
technique on the results via the simulation experiment. 


2.2 State space approach 


While the state space approach can accommodate multivariate 
systems, for the purpose of this study, we address the univariate case 
only. From control theory, a linear, discrete-time, time-invariant sys- 
tem can be represented by 


V4 = Fur + gue+i 
xX, = hu, (2) 


where v; is an r X 1 state vector, and F, g, h are r X r, r X 1, and 
1 X r matrices, respectively. This representation assumes x; (scalar) 
to be the output and u, to be the input of the system. 

Akaike showed that, when x; and u; are stochastic Gaussian proc- 
esses an analogous representation to (2) exists.&’ The new represen- 
tation is called Markovian and can be obtained from the analysis of 
canonical correlations between the set of the present and future output 
and the set of the present and past input. We consider in this paper 
the case where there is a feedback from the output to the input, that 
is, when u; = x;. This Markovian representation takes the form 


Ver = Fu; + gars 
xX, = hu, (3) 
where a4; is the innovation of x, at time ¢ + 1. It is defined by 


Atty = Xe41 — Xet1]t- (4) 


Here {a,} is white noise with autocovariance c, = E(a;-a;_,). Also, xt+1)¢ 
is the projection of x;4; onto the linear space [R(t—)] spanned by the 
components of the present and the past (x;, %:-1, ---). It is the one- 
step-ahead predictor of x;. The space that is spanned by the compo- 
nents of the predictors x;4;;, is called the predictor space [R(¢+ | t—)]. 
The components of the state vector, v;, are elements of this space, 


TELEPHONE USAGE FORECASTING 823 


which is assumed to have a finite dimension. In this regard v; provides 
a specification of the predictor space. 

Since any basis of the predictor space can play a role of the state 
vector, F, g, and h in (3) can have different structures. However, a 
state vector whose elements are the first maximum set of independent 
components of the predictor space has the smallest possible dimension. 
This “minimal” representation defines a canonical representation of 
the system. Its dimension is equal to the number of nonzero canonical 
correlation coefficients. We are interested in the canonical represen- 
tation because it corresponds to the concept of parsimony in the 
ARMA representation, and hence to the problem of identifiability of 
the process. 

To show the relation between the Markovian representation and 
the ARMA representation, suppose an ARMA model for x; is given in 
the form 


Xt + @1Xt-1 + eee + bpXt-p = Qa + 05 Qp-1 ie §,Qr—¢ (5) 
or 
o(B)x, = 0(B)a. 


The zeros of ¢(B) and 6(B) are constrained to lie outside the unit 
circle to ensure stationarity and invertibility of the process. Hence x; 
can be expressed as 


xt = z Wisi, (6) 
where 
Yo=1 and 5 Vy? < , 
Since xt+ij¢ = Xi for i = 0, —1, —2, --- and a,4;= 0 fori=1, 2, ---, 


eq. (5) can be expressed by 
Xetije F OrXesi-rje F +++ 1+ hpXesi-pie 
= Arie + 6, Qt+i-1)¢ a sae af OgAt+i-g\t- (7) 


For 1 = q + 1, the right-hand side of (7) vanishes. Thus for r => max 
(p, gq + 1), (7) becomes 


Xetrje = —PiXt+r-1je — PoXter-aje — +2 + — hrXeje, (8) 
where 
o; =0 for r> p. 


Also, from (6) one can get 
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Nevitde+1 — Xe+itile = WiQe41. (9) 


From (8) and (9) one can see that the state vector vz = (XstXt+1)2 --- 
Xt+r-1\t)’ provides a Markovian representation in the form given by 
(3), where 


0 1 0 

0 0 0 
F = 

0 0 1 

—¢, —r-1 oo —o1 
gs=[1 ho ee) Yel’ 
h=[1 0 --- O}’. 


Here, the y’s are obtained from the relation 
O= Wit i-rt +++ + oWi-p, t= 1,2,---,Q, 
where 
yy; =0 for j<0O. 


Thus, given an ARMA representation of a stationary process, one can 
derive a Markovian representation. Similarly, it can be shown that 
any stationary process that has a Markovian representation has an 
ARMA representation. Also, the minimal Markovian representation— 
known as canonical representation—corresponds to a parsimonious 
ARMA model. 

In practice, the canonical representation needs to be identified from 
a finite length of a time series. The theoretical canonical correlation 
coefficients are replaced by the sample canonical correlation coeffi- 
cients. A criterion is required to determine how small the sample 
canonical correlation coefficients should be to be judged zero. An 
information criterion (AIC) was suggested by Akaike in this regard. 
The AIC criterion is defined by 


AIC = —2(maximum log likelihood) + 2(p + q). 


An approximation of the likelihood function is calculated from the 
sample autocovariances. The log likelihood measures the goodness of 
model fit. The term 2(p + q) gives preference to a structure with the 
least number of free parameters. It was argued that the canonical 
structure has Minimum AIC (MAICE). There is no theoretical proof 
of the optimality of the MAICE in this context however. 

A procedure for automatically identifying the canonical structure 
from time-series data is described in Ref. 8. In this procedure, the 
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amount of past to be used in the canonical-correlation analysis is 
determined from fitting a sequence of autoregressive models. It is 
equal to the order of the model that has MAICE. The canonical 
correlations between the past data and the future with increasing 
number of steps are then calculated. These calculations determine a 
structure of the state vector by which variables that yield large 
correlations are included. Usually, the provided structure of the state 
vector is not necessarily adequate.’ Besides, estimates provided by this 
type of analysis are poor. However, this structure and the estimated . 
parameters may be used as initial guesses. Computation of the likeli- 
hood function for this structure and all possible choices of the state 
vector are then performed. The MAICE structure is hence selected. 
For maximum likelihood estimation, an approximation was devel- 
oped since direct maximization of the likelihood function for each 
possible structure is a formidable task. The developed algorithm is 
based on Davidon’s variance algorithm,’ which requires the evaluation 
of the gradient and the inversion of the approximate Hessian of the 
log likelihood. Our experience with the application of this approach to 
telephone usage data will be further discussed in Section 3.3. 


2.3 Autoregressive spectrum estimation approach 


This approach involves the fitting of an autoregressive process of 
order p—where p is unknown—with the coefficients being estimated 
in the frequency domain. The power spectral density function and its 
factorization are used to estimate model parameters. The order, p, 
may be determined using a stopping rule criterion, such as Parzen’s® 
or Akaike’s.”° The following is a brief account of an autoregressive 
spectral estimation approach. The reader is also referred to the work 
in Refs. 9, 11, 12, and 13. 

Let x, be a zero mean, discrete time, stationary stochastic process 
with autocovariance C, at lag 7. Then 


C, = cov(x:, een ; (10) 


The power spectral density function f(w) at frequency w is defined as 
1 . 
fo) =— Ce“, lol <7. (11) 
Qn 7 
Since {x,} can be represented by an autoregressive process in the form 
DY Xj = ae, (12) 
j=0 


where 


édo=1- and a, ~ N(0, o2), 
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the spectral density of the process is 


—2 
a2 


Qn 


fo) 


x ger 


j=0 


2 
Oa 


~ BafACoP” oe 








f(w) = 
where 
A(w) = X gje” 


is the transfer function of ¢;’s. 

For a known f(w) and an infinite series x;, factorization of the 
spectrum provides values for the parameters ¢;s. A procedure for 
factorization of f(w) that can determine model coefficients is described 
in Ref. 4. 

In practice, however, the spectral density function is unknown and 
there is only a finite realization, say of length T, of a time series. The 
concept of a “windowed” estimate for f(w) is used for carrying out the 
factorization and subsequently the estimation of ¢;’s. The following 
discussion outlines the method. Briefly, 

1 T-r-1 
Cr=— YY xuxe, O<7T<T (14) 
t=0 
gives the sample covariance function. An estimation of the spectral 
function can be expressed by 


TEN a 1 ar T)\ -ire 
fllw) = D5 C! (2) < | (15) 
where p is the truncation point that gives the order of the model, and 
R(v) is called the lag window. Usually, f7(w) and subsequently ¢,’s are 
estimated at N points equally spaced in (0, 7). The equation f7(w,), 
w; = 27t/2N for (t = 0, --- , N — 1) is called the “windowed” estimate 
of f(w). 

The spectral analysis and estimation of model parameters are done 
by three applications of the Fast Fourier Transform (FFT) algorithm.” 
The software for carrying out this analysis is currently available in 
the Statistical Computing Library (STATLIB).“ 

Our use of the analysis in the frequency domain is limited to the 
estimation of the model parameters (¢;, j = 1, ---, p). The advantage 
of this approach is that the problem of model identification can be 
avoided for the stationary part of the process. 


Ill. APPLICATION OF METHODOLOGY TO THE CCS/M DATA 


Central office CCS/M monthly data were sampled for 35 switching 
entities. Preliminary analysis of these data was performed to check if 
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the requirements on the methodology used were met. Data transfor- 
mations were employed where needed. Forecasting models from each 
of the three techniques were fitted to the transformed data for the 
period between May 1969 and April 1975. Forecasts were then gener- 
ated for the next 24 months and converted to ABS values. The percent 
forecast error—which is defined as the percent difference between 
actual and forecast for ABS values—for 1 and 2 years ahead were 
hence calculated. Percent forecast errors were also calculated for the 
random-walk method, which considers the future ABS of a given series 
to be its most recent ABS value. This naive method is selected to serve 
as a benchmark in assessing forecast performance for the other com- 
pared methods. 


3.1 Preliminary data analysis 


As a first step in the analysis, we plotted the raw data for each of 
the time series. Figures 1 through 3 show time-series graphs for three 
of the series (A, B, C). We consider the behavior exhibited in Figs. 1 
and 2 to be typical of the CCS/M data and that of Fig. 3 to be less 
typical. The sample autocorrelation function for Series A is also shown 
in Fig. 4. Preliminary assessments of the time-series graphs indicate 
(1) there are obvious periodicities at lag 12, (2) apparent upward or 
downward trends can be located in few series (Figs. 2 and 3), (3) for 
series that are not contaminated with outliers and/or jumps, station- 
arity can be attained by differencing the data at lag 12, and (4) 
transformation of differenced data is unnecessary. Further tests indi- 
cate that the assumption of process linearity can be accepted in 
general. 

As mentioned above, the method adopted to induce stationarity is 
the first step in the identification process of the Box-Jenkins proce- 
dure. By applying the appropriate differencing, a time zeries {z,} can 
be converted to a stationary series {x,}. Our analysis showed that 
homogeneous nonstationarity of the data can be removed by removing 
the seasonal component (Figs. 4 and 5). Thus, each of the CCS/M 
series was differenced at lag 12 prior to the application of the fore- 
casting techniques. 


3.2 Application of Box-Jenkins technique 


Forecasting models for the selected time series, with varying char- 
acteristics of the CCS/M variable, were individually identified using 
the Box-Jenkins approach. In the following, we briefly discuss the 
application of the method to Series A (Fig. 1). From the autocorrelation 
function (Fig. 5) of the differenced series {x;,} 


% = (1 — BY )z, (16) 
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. Fig. 1—Monthly data on CCS/M, Series A. Visible seasonal variation—typical be- 
avior. 


we note that {x,} is not a white noise process since there is still useful 
predictive information that can be further explained. As a formal test 
of hypothesis on the randomness of x,, we calculated the statistic Q as 


36 
Q=n Y ri(x) = 78, 
k=1 


where r; is the kth autocorrelation coefficient of x, and n (= 60) is the 
number of points in the differenced series. 
For x; to be white noise, at the 5-percent level 


Qcont < x3.95(36) = 50. 


Since Q > Qeonr, we considered modifications of the model in (16). We 
note, from the sample autocorrelation function of {x,}, that the corre- 
lation coefficient at lag 12 is large. This suggests a moving average 
term of order 12. The autocorrelation pattern suggests a low-order 
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Fig.2—Monthly data on CCS/M, Series B. Apparent trend variation for peak 
months—typical behavior. 


autoregressive term. Thus we considered the model 


(1 — ¢B)x, = (1 — OB")a;. (17) 
After estimation, (17) becomes 
(1 — 0.63B)(1 — B')z, = (1 — 0.47B?”)a;,, (18) 


where o, = 0.12. Further inspection of the autocorrelation function of 
the residuals (Fig. 6) suggested no further modification [Q = x7(34) = 
18]. 

3.2.1 Alternative ARMA models 

Models other than those individually identified by the Box-Jenkins 
procedure for the examined series were investigated. Alternative 
models were considered in an attempt to understand the effect of 


overfitting, underfitting, differencing, and modeling a spurious linear 
trend on the forecast of CCS/M. For the sake of space, these effects 
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Fig. 3—Monthly data on CCS/M, Series C. Atypical behavior. 


will be illustrated graphically by showing the data, and a model fit and 
its extrapolation, over two to four years for some selected alternatives. 

To test the effect of overfitting an MA(1) term to the previously 
identified form for Series A, consider 


(1 — ¢B)(1 — B™)x, = (1 — 6B)(1 — OB™”)a;. (19) 


The above model specification was examined, since it was identified 
for several other CCS/M series, using the Box-Jenkins procedure. 
After estimation of model parameters, (19) becomes 


(1 — 0.61B)(1 — B')z, = (1 + 0.05B)(1 — 0.47B”)a,, (20) 
where 


6,= 0.12, Q=19.0. 
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Fig. 4—Estimated autocorrelation, Series A. 


The estimated value for the overfitting parameter, 6, is not signifi- 
cantly different from zero, and the goodness-of-fit measure does not 
indicate inadequacy. And, as expected, this model generates forecasts 
that are identical to those generated by the original model (18). 

Now, consider Series B (Fig. 2). The originally identified Box- 
Jenkins model for this series is 


(1 — 0.79B)(1 — B*)z, 
= (1 — 0.33B)(1 — 0.50B”)a,, Gg = 0.12. (21) 
Three alternatives are discussed. First 
(1 — @B)(1 — ®B™”)z, = (22a) 
with estimated parameters 


¢ = 0.99, ® = 0.33, dq = 0.12. 
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Fig. 5—Estimated autocorrelation, Series A, differenced at Lag 12. 
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Fig. 6—Estimated autocorrelation of residuals, Series A. 


This alternative is considered to determine if the original model, (21), 
is overdifferencing the time series and, if so, the impact on its forecast. 
While both models, (21) and (22a), do not indicate inadequacy from 
the fitting point of view, their forecasts are markedly different. Figures 
7 and 8 give an impression of the forecast performance for these 
models. It is clear that for model (22a) the seasonality in the forecast 
damps out quickly. This implies considerable underforecasting of peak 
months—and consequently of ABS values—for the CCS/M data. 
The two other alternatives for Series B were suggested from its 
graph (Fig. 2). The identified Box-Jenkins model for this series, after 
differencing for seasonality, represents a stationary process, despite 
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Fig. 7—Data, fitting, and forecast, Series B, model (21). 


an apparent linear downward trend. This apparent trend is modeled 
as short-term fluctuation and is represented by ARMA(1, 1) parame- 
ters. If this trend is to be extended into the future, one must consider 
other representations. For example, 


(1 — B)(1 — B™)z, = (1 — 6B)(1 — OB” )a, (22b) 
where 
6=05, 0=0.53, o. = 0.12, 
and 
(1 — B')z, = 0 + (1 — 6B)(1 — OB")a,, (22c) 
where 


6 = 0.04, 6=-032, O=056, o,=0.12. 
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Fig. 8—Data, fitting, and forecast, Series B, model (22a). 


Model (22b) simply suggests an exponential smoothing for the year- 
to-year variation followed by another exponential smoothing for the 
month-to-month variation. Model (22c) includes a deterministic con- 
stant to account for a slope, considered a useful alternative to differ- 
encing. It is more restrictive, however, in the sense that it assumes a 
common slope for the 12 parallel lines (one for each month) connecting 
different years. 

Based on the autocorrelation function of the residuals and on the 
goodness of fit tests, models (22b) and (22c) may represent Series B. 
From the forecast point of view, however, the trend effect was much 
exaggerated by model (22b) and somewhat exaggerated by model (22c) 
(Figs. 9 and 10). The original model did react to the apparent trend 
[through the AR(1) term], but was more conservative in its extrapo- 
lation (Fig. 7). Because of our experience with the CCS/M data, we 
favor the original model. We have learned from the behavior of many 
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Fig. 9—Data, fitting, and forecast, Series B, model (22b). 


time series for CCS/M that a turning point will eventually occur. We 
emphasize here that one must be careful in allowing for a linear trend 
representation in a model unless one is certain that this behavior has 
a physical interpretation. An apparent linear trend may arise simply 
from the repeated summation of independent disturbances. 


3.3 Application of state space approach 


Akaike et al. implemented the state space method into a computer 
package.® The original package appeared in 1974 (known as TIMSAC- 
74) and has been revised several times since. The version we employed 
here was revised in March 1977.* More recently, the method was 
implemented by SAS in a procedure called STATESPACE.”® It is 


*A copy of TIMSAC programs was obtained from the Mathematical Sciences Division 
of the University of Tulsa, Oklahoma. The university distributes these programs for the 
authors. 
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Fig. 10—Data, fitting, and forecast, Series B, model (22c). 


assumed that models are limited to the block identifiability condition.’ 


In this regard, an identified form using this procedure is equivalent to 
an ARMA form of order (p, p — 1). 

Using TIMSAC programs, models were identified for 15 of the 35 
series that were previously identified by the Box-Jenkins approach. 
While preliminary analysis of these data (Section 3.1) indicated that 
for the differenced series {x,} there is a dependency between x, and 
Xt-12 (—0.47 for Series A), which suggests a moving average term of 
order 12, none of the automatically identified models included this 
term. 

We also noted that the numerical procedure for the maximum 
likelihood computation did not converge easily. It required a large 
number of iterations. The problem was due to the improper initial 
values—estimated by the canonical correlation analysis—of model 
parameters. In many cases, these values specified noninvertible 
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models. For six of the series, numerical breakdowns were encountered 
as a result of the singularity of the Hessian of the mean log-likelihood 
function. At the breakdown points, the MAICE model for all series 
was noise, ARMA(0, 0). As for the other nine series, six were identified 
as noise [ARMA(0, 0)], two as MA(1) and one as an AR(1). One of 
the series that was identified as noise is the differenced data for Series 
A. Neither the sample autocorrelation function of this data (Fig. 4) 
nor the goodness-of-fit measure is compatible with such representa- 
tion. We note here that the identified model from the canonical 
correlation analysis for this series is an AR(1) with ¢ = 0.55 and 
AIC = —314. The competing model ARMA(0, 0) was preferred because 
it has AIC = —316. 


3.4 Application of autoregressive approach 


Until now, we have been discussing model identification and forecast 
performance while disregarding the data problems of gross outliers 
and large jumps. Since these problems can be held accountable for 
forecast inaccuracies, the idea of eliminating them from the data prior 
to modeling seemed plausible. 

Two procedures for removing jumps were implemented. The first is 
based on a moving median technique to locate discontinuities in the 
data; and the other uses a robust regression method to remove the 
discontinuities found.’® The latter procedure replaces the time series 
points before a jump by new values at the most recent level of the 
process. It also substitutes suspected single outlying points by more 
probable values. 

After modifying the 35 investigated time series using these algo- 
rithms, we applied the autoregressive spectrum estimation approach 
to the cleaned data. A detailed description of this approach and its 
implementation is given in Ref. 15. Since the considered series are 
fairly short (less than 250 points), their forecasts were generated 
from parametric autoregressive models, with their parameters esti- 
mated in the frequency domain. The procedure applied here uses the 
Fast Fourier Transform (FFT) algorithm to estimate the spectrum. 
The bias that may occur in these estimates due to the limited length 
of the series was handled by using Tukey’s “twicing.”*” 

The ABS forecast accuracies obtained using this approach were not 
significantly different from those found by the random-walk method 
even for the 1 year ahead, in spite of the screening of the data, the 
removal of jumps, and the robust estimation of models parameters. 


IV. SIMULATIONS 


Our objectives in carrying out the simulation study are first to 
determine whether any of the examined methodology can automati- 
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cally identify the actual process in an ideal environment. A second 
objective is to understand the relationship between over- or underpar- 
ameterization of a stationary process and model adequacy. A third 
objective is to assess the forecast performance of the various methods 
and determine if a use of the “best” method is worthwhile. The three 
simulated processes are 


ProcessI (1 —0.6B)(1 — B*)z, 

= (1 — 0.5B™)a,, o, = 0.12 
Process II (1 — 0.8B)(1 — B"™)z, 

= (1 — 0.3B)(1 — 0.5B™)a,, o, = 0.12 
Process III (1 — B’”)z;=aQ, oq = 0.25. 


Processes I and II correspond to the two previously identified models 
for Series A and Series B, respectively. Process III is a seasonal 
random-walk process (where B is replaced by B’*), which was also 
observed in many of the time-series data. 

For each of the three processes, 100 realizations were generated with 
252 observations in each. The white noise {a,} was generated from a 
normal random variate with mean zero and variance equal to those 
estimated from the actual data. The starting values of these realiza- 
tions (21, 22, ---, 213) were taken from the actual CCS/M time series. 
In all realizations, the first 96 points were discarded, the next 120 
points were used for fitting, and the last 36 observations were saved 
for ex-post analysis. In the following we discuss the simulation results 
in terms of our three objectives. 


4.1 Identification 


In this discussion, we exclude the Box-Jenkins approach since its 
identification stage is based on judgment, while the processes to be 
identified are known a priori. The state space procedure and the 
autoregression spectrum estimation procedure were applied to the 
simulated realizations after differencing at lag 12. Table I gives sum- 
mary statistics on model identification of time series generated from 
Processes I and II using the state space procedure. The tabulated 
statistics are the number of realizations an ARMA(p, q) model is 
identified, where p and q are given at certain values and/or specified 
ranges. From these results we note that the procedure did not recognize 
the distant terms (coefficients of a;-12 and a;-13) except for one series. 
For this series, however, the identified model was ARMA (138, 12), 
with nonzero estimates for all 25 coefficients. Further inspection of 
Table I also indicates that the state space procedures did not identify 
correct parsimonious ARMA models. 

We must also indicate that, as with actual data, we encountered the 
same computational difficulties. These difficulties were, in major part, 
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Table |I—Frequency distribution of the 
order of the fitted process using Akaike 
state space procedures 


Frequency 
Fitted Order ProcessI Process II 

p=q=0 19 14 
p=1;q=0 16 9 
p=1;q=1 10 18 
p=0;q=1 0 0 
p=2;q=1 15 8 
p=2;q=2 2 8 
p=0,1;q=2 2 2 
3c ae q)=s5 26 29 
6 <= max(p, gq) = 11 6 8 
max(p, gq) = 12 3 4 
max(p, q) = 13 1 0 

Total 100 100 


connected with the numerical approximations for the maximization . 
of the likelihood function. The singularity of the Hessian was due to 
improper assumed values for model parameters. A large number of 
iterations was required for convergence in most cases. 

On the autoregressive spectrum estimation procedure, Table II 
presents the goodness of fit of its models to the simulated realizations. 
Since the procedure assumes pure autoregressive models, we will not 
discuss whether the assumed order over- or underparameterizes a 
given series. Instead, we compare the estimated value for the standard 
error of the residual (cas in Table II) for each of the simulated series 
with the true value of its process. An average estimate of 0.13 (com- 
pared to o, = 0.12) for the standard error for Processes I and II 
realizations indicates that the assumed models may account for most 
of the predictive information in the simulated data. 


4.2 Effect of overparameterizations 


To demonstrate the effect of overparameterization, we used (19), 
denoted hereafter by S, to model each of the 300 simulated series. This 
model specification was developed as a result of Box-Jenkins analysis 
of actual CCS/M time series. Table II tabulates the estimated values 
for the associated parameters (Gs, ¢1, 61, @12). Inspection of the 
estimated standard error (és) indicates that these models can account 
for the predictive variabilities of the time series (average o, = 0.12 for 
Processes I and II). To compare the estimated values of model coeffi- 
cients to their true values, however, one must understand the redun- 
dancy between autoregressive and moving average terms in a mixed 
model. For simplicity, consider the form 


(1 — $B)»: = (1 — 6B)a,. (23) 
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Table Ila—Standard errors for AS, RW, and S models and estimates 
of the S model parameters, with o. = 0.12—Process | realizations 





This can be written as 
ye = [1 + (¢ — 0)B + (¢” — 06)B* + (¢° — 067)B? + --- Jar. (24) 


For ¢ = 6 = 0, (23) represents a noise process. But when ¢ = 6 € 0 it 
still represents a noise process, since all terms of the polynomial (24) 
cancel out. 

For the model specification (19), similar redundancy exists between 
the AR(1) coefficient, ¢, and the MA(1) coefficient, 0, and to a lesser 
degree it exists between ¢ and the moving average coefficients at lags 
12 and 13. The effect of these redundancies is particularly clear from 
the estimated coefficients of Process III realizations (¢ = 6 #0). 


TELEPHONE USAGE FORECASTING 841 


Table Ilb—Standard errors for AS, RW, and S models and estimates 
of the S model parameters, with a. = 0.12—Process II realizations 


Series 





Based on these results, we conclude that (19) can represent the 
generated processes adequately. In other words, overparameterization 
by a moving average term at lag 1, as in Process I, and furthermore 
by an autoregressive term at lag 1 and moving average terms at lags 
12 and 13, as in Process III, did not affect the standard error of the 
residuals significantly. 


4.3 Forecast accuracy 


For this evaluation, we consider forecasts generated by: (a) the 
model specification in (19) (S), (b) the autoregressive spectral esti- 
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Table Ilc—Standard errors for AS, RW, and S models and estimates 
of the S model parameters, with o. = 0.25—Process III realizations 





oRW Ss $ 6 © |j\Series cas srw 3 r) 6 .S) 


mation method (AS) and, (c) the random-walk model (RW) for the 
simulated series. 

While minimization of the mean square errors during the model- 
fitting phase guarantees minimum mean square errors for the future 
of the simulated data (post-sample phase), we consider ex-post analysis 
to assess the effect of conversion of the monthly forecast to ABS 
values on forecast accuracy. 

The ABS quantity is made up of the average of the three highest 
months in a year. The busy season months can vary from one year to 
the next. A nonbusy season month of one year may become a busy 
season month of the next year as a result of some added white noise. 
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Fig. 11—Notched box plot for one-year-ahead forecast error—Process I realizations. 
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Fig. 12—Notched box plot for two-year-ahead forecast error—Process I realizations. 


Since the effect of this aggregation is not a simple analytical problem, 
we resort to ex-post analysis of the simulated realizations for expla- 
nation. 

Figures 11 through 16 present these comparisons in the form of box 
plots* of the percent forecast accuracy’ for series that are grouped by 


*In a box plot the middle line is the median and the upper and lower lines of the box 
are the upper and lower quartiles, respectively. The box whiskers are drawn out to the 
nearest values within 1.5 quartiles. Points outside the whiskers are considered outliers 
and are indicated by asterisks. The notches in the side of a box represents a rough 95- 
percent confidence interval for the median. 
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Fig. 13—Notched box plot for one-year-ahead forecast error—Process II realizations. 
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Fig. 14—Notched box plot for two-year-ahead forecast error—Process II realizations. 


their process type for one and two years ahead. Based on these graphs, 
we draw a general conclusion that there are no significant differences 
in the forecasting accuracies among the three methods. 

We did notice, however, that an optimal model that represents the 
actual process, and is expected to provide best forecasts, results in low 
prediction of the ABS values. More specifically, the S form is the 
model specification for Process II. Thus, it is expected to generate 
“best” forecasts. Figures 13 and 14, however, indicate a bias towards 
underforecast of ABS values for optimal models. The estimated me- 
dians of the percent errors for one- and two-years-ahead forecasts are 
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Fig. 15—Notched box plot for one-year-ahead forecast error—Process III realizations. 
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Fig. 16—Notched box plot for two-year-ahead forecast error—Process III realizations. 


—1.0 and —0.7, respectively. This observation is also repeated for 
Process III. The random-walk model underforecasts ABS values for 
realizations generated by the random-walk process. Figures 15 and 16 
demonstrate this observation. The estimated medians for one and two 
years ahead were —1.0 and —2.4, respectively (notice that here 
oq = 0.25). In this regard, the ABS forecast is not optimal. 

In the following we give a heuristic argument on the above obser- 
vation and explain why the ABS forecast is biased. When the ABS 
forecast is made at time t, (ABS,), it is conventionally derived from 
the monthly forecast by 
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Table III—Simulation results. Frequency method A outperforms 
Method B. There are 100 realizations generated from each process 








Method B 
Method Process 1 Year Ahead 2 Years Ahead 3 Years Ahead 

A No. S AS RW S AS RW S AS RW 
S — 57 52 — 62* 58 — 50 47 
AS I 43 — 45 37 — 40 49 — 39 
RW 45 40 —_ 39 45 — 51 43 — 
S — 51 51 — 58 52 — 50 43 
AS II 46 — 45 39 — 42 50 — 43 
RW 45 38 — 45 41 — 54 41 — 
S — 41 32 — 52 36 — 39 42 
AS Ill 50 — 41 42 — Al 45 — 45 
RW 46 48 — 44 43 — 44 Al —_— 


* Indicates significance at 5-percent level. 


ABS, = ABS(Z,, 241) ee 2t4+12). : 


Suppose z, is white noise and the maximum forecast for the next k 
months is required. The best forecast is the expected value of the 
maximum of k-order statistics from the normal distribution, which is 
a positive number. On the other hand, the maximum of the predictions 
Of 241, 242, °°, 2t+e 1S zero. This observation indicates the bias in 
ABS. However, one can adjust for this bias since it can be estimated 
directly from the simulation results. 

For further evaluation of forecast accuracy, Table III gives the 
frequency that Method A outperforms Method B. This table was 
generated using a sign test.'® This test does not take into account the 
magnitude of the error. 


V. SUMMARY AND CONCLUSIONS 


Through extensive data analysis of actual and simulated time series 
we answered, in this paper, numerous crucial questions: specific ques- 
tions concerning the selection of a best univariate method for fore- 
casting telephone usage demand, and general questions on the weak- 
nesses and strengths of the three state-of-the-art techniques. 

On the specific questions, our results indicate that none of the 
investigated techniques outperformed a naive random walk in fore- 
casting the ABS CCS/M variable. By means of simulation results we 
show that there is no gain in using optimal models (models that 
represent the exact process) for generating point forecasts over a 
random-walk model for lead times of more than one year. Moreover, 
we pointed out a bias in the ABS measure which can be adjusted for, 
using the simulation result. 

Concerning the general issues, we demonstrated that identification 
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of the nonstationary behavior is critical to the forecast accuracy. It is 
this behavior that has a long-lasting effect on a process. While accu- 
racy is sensitive to this identification, proper techniques are lacking. 
Of the three investigated methods, only the Box-Jenkins approach 
gives some guidelines, but it requires user’s judgment. 

The simulation results also indicate that the AIC criterion did not 
identify parsimonious models for the investigated processes. They 
point out that the nonconvergence of the numerical procedures for the 
state space approach can add considerable complexities and can in- 
crease computer costs—all of which must be considered when assessing 
the feasibility of the method as an automatic forecasting technique. 
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